How do you resolve the small files problem in HDFS?
What does formatting the DFS mean?
What is meant by the replication factor?
If a DataNode is full, how is that identified?
How do you switch from su to cloudera?
How would you use MapReduce to split a very large graph into smaller pieces and parallelize the computation of edges when the data changes rapidly and dynamically?
What are the two main components of the ResourceManager?
What is the difference between an RDBMS and Hadoop?
What is stored in HDFS?
Have you ever used Counters in Hadoop? Give us an example scenario.
What is Safemode in Apache Hadoop?
What is a MapFile?
Which are the two types of 'writes' in HDFS?