What is the role of the RecordReader in Hadoop MapReduce?
What happens when the node running the map task fails before the map output has been sent to the reducer?
What is speculative execution?
Is it legal to set the number of reduce tasks to zero? Where will the output be stored in that case?
What are the advantages of using a map-side join in MapReduce?
What is a map-side join?
What is a combiner, and where should you use it?
When should you use SequenceFileInputFormat?
What is the purpose of TextInputFormat?
What is a reduce-side join in MapReduce?
What do you mean by InputFormat?
What are the various configuration parameters required to run a MapReduce job?
What is the distributed cache in the MapReduce framework?
What do you mean by data locality?
How can we ensure that all the values for a particular key go to the same reducer?