Big Data Interview Questions
Questions Answers Views Company eMail

In how many ways can we create an RDD?

198

What does repartition do in Spark?

197

What is the driver program in Spark?

182

What is spark-submit?

188

How do I clear my Spark cache?

178

What is a partition in Spark?

213

What is Spark vectorization?

188

What is off-heap memory in Spark?

182

What is a tuple in Spark?

193

Is Spark an ETL tool?

190

How is an RDD distributed?

199

What are the common transformations in Apache Spark?

186

What is the difference between a Dataset and a DataFrame in Spark?

221

What is the distributed cache in Spark?

201

What is the Catalyst framework in Spark?

190


Un-Answered Questions { Big Data }

Elaborate on Cassandra CQL.

49


Is Impala production-ready?

33


What are the various programming languages supported by Spark?

233


Explain Apache Ambari.

41


How can we see all the clusters that are available in Ambari?

44


If a Hadoop administrator needs to make a change, which configuration file must be edited?

247


How can you send large messages with Kafka (over 15 MB)?

306


What is a "Spark Driver"?

204


What is the connection between Hadoop and big data?

210


Are Cassandra, Hadoop, and HBase the same in nature? Specify.

43


Is there an easy way to expire a session for testing?

1


Explain the basic difference between Cassandra and HBase.

55


Compare HBase and HDFS.

28


What is partitioning in MapReduce?

407


List commonly used machine learning algorithms.

186