Which operating systems are supported for production Hadoop deployments?
What are some of the characteristics of the Hadoop framework?
Which modes can Hadoop be run in? List a few features of each mode.
Is it possible to write Hadoop job output to multiple directories? If yes, how?
What are the side data distribution techniques?
Define “speculative execution” in Hadoop.
What is the difference between a reducer and a combiner?
What are the side effects of not running a Secondary NameNode?
What is the difference between an InputSplit and a block?
Which directory does Hadoop install to?
What are the most common OutputFormat classes in Hadoop?
What happens when the NameNode enters safe mode in Hadoop?
Hadoop achieves parallelism by dividing tasks across many nodes, so a few slow nodes can rate-limit the rest of the program and slow it down. What mechanism does Hadoop provide to combat this?
What is a slot in Hadoop v1? Why was it removed in Hadoop v2?
What is logistic regression?