Why is Spark RDD immutable?
What is meant by RDD lazy evaluation?
How is an RDD distributed?
How many partitions are created by default in an Apache Spark RDD?
What is the difference between Spark and Scala?
What is the spark driver?
What is the difference between the transform operation and map on a Spark DStream?
What do you understand by Transformations in Spark?
What do you understand about YARN?
Explain the sum(), max(), and min() operations in Apache Spark.
What advantages does Spark offer over Hadoop MapReduce?
What are the major features/characteristics of RDDs (Resilient Distributed Datasets)?
Why is Apache Spark so fast?
Why does Spark skip stages?
What are the file formats supported by Spark?