What is Disk Balancer in Apache Hadoop?
What is commodity hardware? Does commodity hardware include RAM?
What is a task tracker?
What are Input Split and Record Reader, and what do they do?
What is a job tracker?
What is the difference between a regular file system and HDFS?
What is the problem with small files in Apache Hadoop?
What is the use of combiners in the Hadoop framework?
What is a NameNode? How many NameNode instances run on a Hadoop cluster?
Why is it named ‘Hadoop’?
How will you make changes to the default configuration files?
What is rack awareness in the NameNode?
What is crontab? Explain with a suitable example.