Big Data Interview Questions
Questions Answers Views Company eMail

List some commonly used machine learning algorithms in Apache Spark.

186

What are the commands to start and stop Spark in an interactive shell?

207

List the ways of creating an RDD in Apache Spark.

194

What are the various advantages of DataFrame over RDD in Apache Spark?

193

What is flatMap in Apache Spark?

205

What is standalone mode in a Spark cluster?

164

Explain Apache Spark Streaming. How is the processing of streaming data achieved in Apache Spark?

192

In what ways is SparkSession different from SparkContext?

238

Explain the fold() operation in Spark.

200

Define SparkContext in Apache Spark.

190

List the various advantages of DataFrame over RDD in Apache Spark.

194

What is map in Apache Spark?

184

Write the commands to start and stop Spark in an interactive shell.

187

Define the various running modes of Apache Spark.

191

What are the ways to run Spark over Hadoop?

183


Un-Answered Questions { Big Data }

Explain the REPEAT function in Hive with an example.

491


What is the definition of Hive?

455


Why should we use an Apache Kafka cluster?

317


What do you mean by column family?

41


How much does Flume cost?

54






Is a reduce-only job possible in Hadoop MapReduce?

396


Define the use of the SOURCE command in Cassandra.

67


What is hotspotting in HBase?

125


What is the problem with having lots of small files in HDFS?

32


What are Impala's built-in functions?

42


Explain the sortByKey() operation.

206


What is Immutable?

230


How to create a user in Hadoop?

226


Can Sqoop use Spark?

1


Explain the CLI in ZooKeeper.

5