What is Spark and what is its purpose?
Explain Spark Core.
Can you explain what a worker node is?
What are the 4 V's of big data?
Which one would you choose for a project: Hadoop MapReduce or Apache Spark?
What are the ways to run Spark over Hadoop?
Describe accumulators in Apache Spark in detail.
How does Spark use Akka?
What is the fundamental data structure of Spark?
Define Actions.
What are accumulators and broadcast variables in Spark?
Why are transformations lazy operations on Apache Spark RDDs? How is this useful?
What are the various data sources available in Spark SQL?
What is Spark MLlib?
Why is Spark used?