List the configuration parameters that have to be specified when running a MapReduce job.
What is the difference between a MapReduce InputSplit and an HDFS block?
What is the need for MapReduce?
what is "map" and what is "reducer" in Hadoop?
In MapReduce, where is sorting done: on the mapper node or the reducer node?
What is shuffling in MapReduce?
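The shuffle-and-sort step between map and reduce can be sketched in plain Python (a conceptual simulation, not Hadoop code): mapper output is sorted by key so equal keys become adjacent, then grouped so each reducer sees one key with all of its values.

```python
from itertools import groupby
from operator import itemgetter

# Intermediate (key, value) pairs as emitted by mappers, in arbitrary order.
mapper_output = [("banana", 1), ("apple", 1), ("banana", 1), ("cherry", 1)]

# Shuffle & sort: order pairs by key, then group equal keys together,
# mimicking what the framework does before invoking the reducers.
shuffled = sorted(mapper_output, key=itemgetter(0))
grouped = [(key, [v for _, v in pairs])
           for key, pairs in groupby(shuffled, key=itemgetter(0))]
print(grouped)  # -> [('apple', [1]), ('banana', [1, 1]), ('cherry', [1])]
```

This also answers where sorting happens: in Hadoop, map output is sorted on the mapper side and merged on the reducer side before `reduce()` runs.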
What does a 'MapReduce Partitioner' do?
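A partitioner decides which reducer receives each intermediate key. The sketch below is an illustrative Python analogue of Hadoop's default hash-based partitioning (a stable hash of the key modulo the reducer count), not the actual `HashPartitioner` class:

```python
import zlib

def hash_partition(key, num_reducers):
    # Illustrative hash partitioner: a stable hash of the key modulo the
    # number of reducers. Every occurrence of the same key maps to the
    # same partition, so one reducer sees all values for that key.
    return zlib.crc32(key.encode()) % num_reducers

# The same key always lands on the same reducer partition.
print(hash_partition("apple", 4) == hash_partition("apple", 4))  # -> True
```

A custom partitioner is useful when keys must be routed by some field (e.g. all records for one customer to one reducer) rather than by a plain hash.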
What is the data storage component used by Hadoop?
Which features of Apache Spark make it superior to Hadoop MapReduce?
What is the Distributed Cache in Hadoop?
What are IdentityMapper and IdentityReducer in MapReduce?
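IdentityMapper and IdentityReducer are the defaults Hadoop uses when no mapper or reducer class is set: they pass records through unchanged. A minimal Python sketch of that pass-through behaviour (a conceptual analogue, not the Hadoop classes themselves):

```python
def identity_map(key, value):
    # IdentityMapper analogue: emit the input pair unchanged.
    yield (key, value)

def identity_reduce(key, values):
    # IdentityReducer analogue: emit every value for the key unchanged.
    for v in values:
        yield (key, v)

pairs = [(0, "hello"), (1, "world")]
mapped = [p for k, v in pairs for p in identity_map(k, v)]
print(mapped == pairs)  # -> True
```

A job built only from these identity phases still performs the shuffle-and-sort step, so it effectively sorts its input by key.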