Big Data Interview Questions
Questions Views

What are shared variables in Spark?

213

What is the future of Apache Spark?

195

How can I improve my Spark performance?

188

What is the Apache Spark architecture?

216

Why is Spark faster than Hive?

188

What happens if an RDD partition is lost due to worker node failure?

306

What is a pair RDD in Spark?

200

What is the difference between cache and persist in Spark?

192

Is bigger than spark.driver.maxResultSize?

217

Does Spark use Java?

201

How do you process big data with Spark?

179

What is a Spark shuffle?

210

Why do we need Apache Spark?

191

How do I optimize my Spark code?

199

What is the difference between client mode and cluster mode in Spark?

205


Un-Answered Questions { Big Data }

Define a NameNode.

385


Does Cassandra support ACID transactions?

50


What is Hadoop? Name the main components of a Hadoop application.

249


What are the limitations of Pig?

324


How do you create a directory in HDFS?

38


What is the external shuffle service in Spark?

208


What do you know about transformations in Spark?

205


What is YARN in Hadoop?

410


Can the balancer be run while Hadoop is in use?

675


Explain the future growth of Apache Ambari.

49


Can you change the block size of HDFS files?

32


What is the default input format in MapReduce?

767


Explain the HDFS “write once, read many” pattern.

24


What are all the tasks we can perform to manage services using the Ambari service tab?

51


Explain the fold() operation in Spark.

202