Explain the run-time architecture of Spark.
Explain the general MapReduce algorithm.
What do you know about speculative execution?
What are the Hadoop libraries and utilities, and what are some miscellaneous Hadoop applications?
What is the difference between Apache Mahout and Spark MLlib?
What happens if one of the DataNodes has a much slower CPU?
Explain the operations supported by an Apache Spark RDD.
Explain DStream with reference to Apache Spark.
What is the function of the MapReduce partitioner?
Which storage level does the cache() function use?
What is a ColumnFamily?
What is Hive on Spark?
Can you define InputSplit in Hadoop?
What is the current version of Hive?
What are TaskTracker and JobTracker?