On what basis does the NameNode distribute blocks across the DataNodes?
What is SPF?
What is a DataNode?
What are Knox and the Hadoop Development Tools?
What problems have you faced while working on Hadoop code?
What is the difference between SQL and NoSQL?
What are the port numbers of the NameNode, JobTracker, and TaskTracker?
What is the small file problem in Hadoop?
Explain how ‘map’ and ‘reduce’ work.
What is the difference between Apache Hadoop and RDBMS?
How can you override the replication factor in HDFS?
Why do we use HDFS for applications with large data sets and not when there are a lot of small files?
What is DistributedCache, and what is its purpose?