What is the process to change the files at arbitrary locations in HDFS?
How is indexing done in HDFS?
How would you use MapReduce to split a very large graph into smaller pieces and parallelize the computation of edges when the underlying data changes rapidly?
Suppose Hadoop spawned 100 tasks for a job and one of the tasks fails. What will Hadoop do?
What are the TaskTracker and JobTracker?
What happens to the NameNode when the JobTracker is down?
How is the distance between two nodes defined in Hadoop?
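For reference when answering this: Hadoop models the cluster as a tree (data center / rack / node) and defines the distance between two nodes as the total number of hops from each node up to their closest common ancestor. A minimal stand-alone sketch of that rule (the `distance` helper and the `/d1/r1/n1`-style location paths are illustrative, not Hadoop's actual API):

```java
// Sketch of Hadoop's network-topology distance: each node has a location
// path like /datacenter/rack/node, and the distance between two nodes is
// the number of hops from each up to their closest common ancestor.
public class TopologyDistance {
    // Hypothetical helper: computes distance from two location path strings.
    static int distance(String a, String b) {
        String[] pa = a.split("/");
        String[] pb = b.split("/");
        int common = 0;
        int max = Math.min(pa.length, pb.length);
        while (common < max && pa[common].equals(pb[common])) {
            common++;
        }
        // Hops from each node up to the common ancestor.
        return (pa.length - common) + (pb.length - common);
    }

    public static void main(String[] args) {
        // Same node: 0; same rack: 2; different racks: 4; different data centers: 6.
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n1")); // 0
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n2")); // 2
        System.out.println(distance("/d1/r1/n1", "/d1/r2/n3")); // 4
        System.out.println(distance("/d1/r1/n1", "/d2/r3/n4")); // 6
    }
}
```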
Explain what the master class and the output class do.
What is Hadoop serialization?
How do you define "block" in HDFS?
When does Hadoop enter safe mode?
What are column families? What happens if you alter the block size of a column family on an already populated database?
How do you enable the trash/recycle bin in Hadoop?
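For reference when answering this: trash is enabled by setting `fs.trash.interval` in `core-site.xml` to a nonzero number of minutes. One common configuration (the value 1440, i.e. one day, is just an example):

```xml
<property>
  <name>fs.trash.interval</name>
  <!-- Minutes a deleted file is retained in .Trash; 0 disables trash. -->
  <value>1440</value>
</property>
```

With this set, `hadoop fs -rm` moves files into the user's `.Trash` directory instead of deleting them immediately.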
What do the slave nodes consist of?
What do you know about KeyValueTextInputFormat?
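For reference when answering this: KeyValueTextInputFormat treats each line of input as a record and splits it at the first occurrence of a separator (tab by default, configurable via `mapreduce.input.keyvaluelinerecordreader.key.value.separator`) into a Text key and a Text value. A minimal stand-alone sketch of that splitting rule, without the Hadoop dependency (the `split` helper is illustrative, not Hadoop's actual API):

```java
// Mimics how KeyValueTextInputFormat derives a (key, value) pair from a
// line: split at the FIRST separator only; if there is no separator, the
// whole line is the key and the value is empty.
public class KeyValueSplit {
    static String[] split(String line, char separator) {
        int idx = line.indexOf(separator);
        if (idx < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, idx), line.substring(idx + 1) };
    }

    public static void main(String[] args) {
        String[] kv = split("user42\tclicked\tbutton", '\t');
        System.out.println(kv[0]); // user42
        System.out.println(kv[1]); // "clicked\tbutton" -- only the first tab splits
    }
}
```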