What is MapReduce used for?
What is a combiner, and when should you use one in a MapReduce job?
Explain the execution sequence of the MapReduce components, such as map, reduce, RecordReader, split, combiner, partitioner, sort, and shuffle.
How would you calculate the number of unique visitors for each hour by mining a huge Apache log? You may post-process the output of the MapReduce job.
What is a combiner, and where should you use it?
When do reducers come into play in a MapReduce job?
How do you change the number of mappers running on a slave node in MapReduce?
What is the default input type in MapReduce?
What happens when the node running the map task fails before the map output has been sent to the reducer?
Is it legal to set the number of reducer tasks to zero? Where will the output be stored in that case?
What is the Hadoop MapReduce API's contract for key and value classes?
Why is the map output in MapReduce written to local disk?
What is Data Locality in Hadoop?
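
For the question above about counting unique visitors per hour in an Apache log, one possible approach can be sketched in plain Python. This is only an in-memory simulation of the map, shuffle, and reduce phases, not Hadoop code. It assumes the log is in Apache Common Log Format and that a visitor is identified by client IP. The function names and the sample log lines are hypothetical and made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: Apache Common Log Format, e.g.
# host ident user [dd/Mon/yyyy:HH:MM:SS zone] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [^\]]+\]'
)

def map_line(line):
    """Map phase: emit ((date, hour), client_ip) for each parseable line."""
    match = LOG_PATTERN.match(line)
    if match:
        ip, date, hour = match.groups()
        yield (date, hour), ip

def shuffle(pairs):
    """Shuffle phase: group values by key; a set drops repeat visitors."""
    grouped = defaultdict(set)
    for key, ip in pairs:
        grouped[key].add(ip)
    return grouped

def reduce_group(key, ips):
    """Reduce phase: the count of distinct IPs is the unique-visitor count."""
    return key, len(ips)

def unique_visitors_per_hour(lines):
    pairs = (pair for line in lines for pair in map_line(line))
    return dict(reduce_group(k, v) for k, v in shuffle(pairs).items())

# Hypothetical sample lines, invented for this sketch.
sample_log = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:13:58:01 +0000] "GET /a HTTP/1.1" 200 128',
    '10.0.0.2 - - [10/Oct/2023:13:59:59 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:14:00:00 +0000] "GET / HTTP/1.1" 200 512',
]
result = unique_visitors_per_hour(sample_log)
print(result)
```

On a real cluster, holding every IP for an hour in one reducer's memory may not scale. An alternative that uses the post-processing the question allows is to have the job emit each distinct (hour, IP) pair once and then count the output lines per hour in a second pass.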