What is a MapReduce algorithm?
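For illustration, a minimal word-count sketch against the org.apache.hadoop.mapreduce API (class and variable names are illustrative): the map step emits a (word, 1) pair per token, and the reduce step sums the counts for each distinct word.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map phase: emit (word, 1) for every token in the input line.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (token.isEmpty()) continue;
            word.set(token);
            context.write(word, ONE);
        }
    }
}

// Reduce phase: sum the counts emitted for each word.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```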
What do you know about NLineInputFormat?
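NLineInputFormat makes each mapper process a fixed number of input lines rather than one block-sized split. A minimal wiring sketch (the job name and line count are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

public class NLineExample {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "nline-example");
        // Each map task receives exactly 100 input lines,
        // instead of one HDFS-block-sized split.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 100);
        // Equivalent config key: mapreduce.input.lineinputformat.linespermap
    }
}
```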
How is Hadoop different from other data processing tools?
Explain the difference between a MapReduce InputSplit and an HDFS block.
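In short, a block is a fixed physical unit of HDFS storage, while an InputSplit is a logical range computed by the InputFormat, so split size can be tuned independently of block size. A sketch (the 64 MB and 32 MB bounds are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SplitSizeExample {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "split-size-example");
        // Splits are logical: cap them at 64 MB even if the
        // underlying HDFS block size is, say, 128 MB.
        FileInputFormat.setMaxInputSplitSize(job, 64L * 1024 * 1024);
        FileInputFormat.setMinInputSplitSize(job, 32L * 1024 * 1024);
    }
}
```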
What are the main configuration parameters a user needs to specify to run a MapReduce job?
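A driver sketch covering the usual required parameters: input and output paths, mapper and reducer classes, output key/value types, and the job JAR (input/output formats default to text). It reuses the word-count classes sketched above:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobSetup {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word-count");
        job.setJarByClass(JobSetup.class);               // job JAR
        job.setMapperClass(WordCountMapper.class);       // mapper class
        job.setReducerClass(WordCountReducer.class);     // reducer class
        job.setOutputKeyClass(Text.class);               // output key type
        job.setOutputValueClass(IntWritable.class);      // output value type
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```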
How do you set the number of reducers?
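A short sketch; the count can be set on the Job object or via the mapreduce.job.reduces property (the value 10 is illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCount {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reducer-count-example");
        job.setNumReduceTasks(10); // ten reduce tasks; 0 makes the job map-only
        // Command-line equivalent: -D mapreduce.job.reduces=10
    }
}
```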
Explain task granularity in MapReduce.
How do you submit extra files (JARs, static files) to a MapReduce job at runtime?
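A sketch using the distributed cache (the HDFS paths and file names are hypothetical):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class DistributedCacheExample {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cache-example");
        // Ship a lookup file to every task's working directory.
        job.addCacheFile(new URI("hdfs:///data/lookup.txt#lookup"));
        // Add an extra JAR to the task classpath.
        job.addFileToClassPath(new Path("/libs/extra.jar"));
        // Command-line equivalents (via GenericOptionsParser):
        //   hadoop jar myjob.jar MyDriver -files lookup.txt -libjars extra.jar <in> <out>
    }
}
```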
What is the role of a MapReduce partitioner?
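The partitioner decides which reducer receives each intermediate key; the default HashPartitioner uses (key.hashCode() & Integer.MAX_VALUE) % numPartitions. A hypothetical custom partitioner that groups words by their first character:

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // Route all words sharing a first character to the same reducer.
        String s = key.toString();
        char first = s.isEmpty() ? 'a' : s.charAt(0);
        return first % numPartitions; // char promotes to a non-negative int
    }
}
// Register it on the driver: job.setPartitionerClass(FirstLetterPartitioner.class);
```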
Is it important for Hadoop MapReduce jobs to be written in Java?
What are the identity mapper and reducer? In which cases can we use them?
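In the current API, the base Mapper and Reducer classes already behave as identities: map emits its input pair unchanged, and reduce emits the key once per value. The old mapred API ships explicit IdentityMapper and IdentityReducer classes. A sketch of a job that only shuffles and sorts its input by key:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class IdentityExample {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "identity-example");
        // The base classes pass records through unchanged, so this job
        // performs only the shuffle/sort: useful for sorting data or
        // converting between file formats with no per-record logic.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);
    }
}
```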
What happens when Hadoop spawns 50 tasks for a job and one of the tasks fails?
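For context: a failed task is rescheduled on another node, and the job as a whole fails only when a single task exhausts its attempt budget. A hedged sketch of the relevant knobs (4 is the default in both cases):

```java
import org.apache.hadoop.conf.Configuration;

public class RetryConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Maximum attempts per task before the whole job is marked failed.
        conf.setInt("mapreduce.map.maxattempts", 4);
        conf.setInt("mapreduce.reduce.maxattempts", 4);
    }
}
```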
Explain what conf.setMapperClass does in MapReduce.
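A sketch with the old mapred API, where JobConf.setMapperClass registers which Mapper implementation the framework instantiates for every map task (the IdentityMapper choice here is illustrative; the new API's equivalent is Job.setMapperClass):

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class OldApiDriver {
    public static void main(String[] args) {
        JobConf conf = new JobConf(OldApiDriver.class);
        // Tells the framework which Mapper class to instantiate
        // for each map task of this job.
        conf.setMapperClass(IdentityMapper.class);
    }
}
```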