Explain task granularity in MapReduce.
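Granularity is determined by how the input is divided: each input split becomes one map task, so split-size settings control how fine- or coarse-grained the job is. A minimal sketch, assuming the org.apache.hadoop.mapreduce API; the 64 MB/128 MB bounds and class name are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

class GranularityDemo {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "granularity-demo");
    // Each input split becomes one map task; bounding the split size
    // bounds the number (and size) of map tasks.
    FileInputFormat.setMinInputSplitSize(job, 64L * 1024 * 1024);   // 64 MB floor
    FileInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);  // 128 MB ceiling
  }
}
```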
What is Hadoop MapReduce?
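As a concrete illustration, here is a minimal word-count sketch in the org.apache.hadoop.mapreduce API (class names are illustrative): the mapper emits (word, 1) pairs, the framework shuffles them by key, and the reducer sums each group.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: called once per input record; emits (word, 1) for each token.
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    for (String token : line.toString().split("\\s+")) {
      if (!token.isEmpty()) {
        word.set(token);
        context.write(word, ONE);
      }
    }
  }
}

// Reducer: called once per distinct key after the shuffle; sums the counts.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable c : counts) {
      sum += c.get();
    }
    context.write(word, new IntWritable(sum));
  }
}
```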
What is shuffling in MapReduce?
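Shuffling is the phase where map output is partitioned, copied to the reducers, and merge-sorted by key, so each reducer sees one sorted group per key. The routing decision is made by the partitioner; this sketch (class name illustrative) mirrors the default hash behaviour:

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// During the shuffle, every map output record is sent to exactly one reducer.
// getPartition() picks which; this reproduces Hadoop's default HashPartitioner.
class WordPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}
```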
How do you overwrite an existing output directory when running a MapReduce job?
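FileOutputFormat deliberately fails the job if the output directory already exists, so the usual workaround is to delete it in the driver before submission. A minimal sketch; the class and method names are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class OutputCleanup {
  // Call from the driver before job.waitForCompletion().
  static void deleteIfExists(Configuration conf, String dir) throws Exception {
    Path output = new Path(dir);
    FileSystem fs = output.getFileSystem(conf);
    if (fs.exists(output)) {
      fs.delete(output, true);  // true = delete recursively
    }
  }
}
```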
When are the reducers started in a MapReduce job?
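Reducers are launched before all maps finish, but only to start copying map output (the shuffle); reduce() itself runs only after the last map completes. How early the copy phase starts is configurable. A sketch, assuming the Hadoop 2 property name; the 0.8 threshold is illustrative:

```java
import org.apache.hadoop.conf.Configuration;

class SlowstartDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Reducers may begin fetching map output once 80% of maps have finished;
    // the default is typically 0.05 (start copying almost immediately).
    conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 0.8f);
  }
}
```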
What is a scarce system resource in a MapReduce cluster?
What is the difference between Job and Task in MapReduce?
How do you optimize a MapReduce job?
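Common levers include a combiner (map-side pre-aggregation) and compressing intermediate map output to shrink shuffle traffic. A sketch wiring both, reusing the WordCountReducer sketch above as the combiner; Snappy is an illustrative codec choice:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;

class OptimizedJobSetup {
  static Job configure() throws Exception {
    Configuration conf = new Configuration();
    // Compress intermediate map output to cut shuffle traffic.
    conf.setBoolean("mapreduce.map.output.compress", true);
    conf.setClass("mapreduce.map.output.compress.codec",
        SnappyCodec.class, CompressionCodec.class);

    Job job = Job.getInstance(conf, "optimized-job");
    // Combiner pre-aggregates on the map side, shrinking shuffle data.
    job.setCombinerClass(WordCountReducer.class);
    return job;
  }
}
```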
How is reporting controlled in Hadoop?
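Tasks report back to the framework through counters (metrics aggregated job-wide), status strings, and progress heartbeats, all exposed via the task context. A sketch with a hypothetical custom counter; class and enum names are illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

class ReportingMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
  enum Records { MALFORMED }  // hypothetical custom counter

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    if (line.toString().isEmpty()) {
      context.getCounter(Records.MALFORMED).increment(1);  // aggregated job-wide
      return;
    }
    context.setStatus("at offset " + offset.get());  // shown in the web UI
    context.progress();  // heartbeat: tells the framework the task is alive
    context.write(line, NullWritable.get());
  }
}
```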
What is the sequence of execution of map, reduce, RecordReader, split, combiner, and partitioner?
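The order is: the InputFormat computes splits, a RecordReader turns each split into records, map() runs per record, the (optional) combiner pre-aggregates spilled map output, the partitioner routes keys through the shuffle/sort, and reduce() runs per key group. A driver sketch wiring the pieces in that order, reusing the illustrative classes from the earlier sketches:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

class PipelineDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "pipeline-order");
    job.setJarByClass(PipelineDriver.class);

    // 1. InputFormat computes splits; its RecordReader produces records.
    job.setInputFormatClass(TextInputFormat.class);
    // 2. map() runs once per record.
    job.setMapperClass(WordCountMapper.class);
    // 3. (optional) combiner pre-aggregates map output.
    job.setCombinerClass(WordCountReducer.class);
    // 4. partitioner routes each key to a reducer; shuffle/sort follows.
    job.setPartitionerClass(WordPartitioner.class);
    // 5. reduce() runs once per key group.
    job.setReducerClass(WordCountReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```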
In the MapReduce data flow, when is the combiner called?
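The combiner runs on the map side when in-memory output spills to disk, and possibly again when spill files are merged; the framework may call it zero, one, or many times. It must therefore tolerate repeated application. A sketch of a safe combiner (class name illustrative): partial sums of partial sums still give the right total, whereas a plain average would not survive re-application.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// A combiner is just a Reducer applied to map output before the shuffle.
// Because the number of invocations is not guaranteed, the operation must
// be commutative and associative, like this sum.
class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> partials, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable p : partials) {
      sum += p.get();
    }
    context.write(key, new IntWritable(sum));
  }
}
```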
What is the problem with small files in Hadoop?
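Every file, block, and directory costs NameNode heap, and with the default input format each small file becomes its own split, and hence its own map task, so a job over thousands of tiny files wastes both memory and task-startup time. One standard mitigation is CombineTextInputFormat, which packs many files into one split; the 128 MB ceiling and class name below are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;

class SmallFilesSetup {
  static Job configure() throws Exception {
    Job job = Job.getInstance(new Configuration(), "small-files-job");
    // Pack many small files into each split, so one map task reads them all.
    job.setInputFormatClass(CombineTextInputFormat.class);
    CombineTextInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);  // 128 MB
    return job;
  }
}
```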
How will you submit extra files or data (such as JARs or static files) for a MapReduce job at runtime?
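Side data travels through the distributed cache, either via GenericOptionsParser flags on the command line or programmatically in the driver. A sketch; the paths and class name are illustrative:

```java
import java.net.URI;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

class SideDataSetup {
  static void attach(Job job) throws Exception {
    // Equivalent CLI form (driver must go through ToolRunner/GenericOptionsParser):
    //   hadoop jar app.jar MyDriver -files lookup.txt -libjars extra.jar in out
    job.addCacheFile(new URI("/user/demo/lookup.txt"));       // copied to each node
    job.addFileToClassPath(new Path("/user/demo/extra.jar")); // added to task classpath
    // Tasks can list the cached files with context.getCacheFiles().
  }
}
```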