Why do we use HDFS for applications with large data sets, and not when there are a lot of small files?
On which port does SSH work?
How can I install Cloudera VM in my system?
How is the distance between two nodes defined in Hadoop?
What is rack-aware replica placement policy?
What is Schema on Read and Schema on Write?
What is speculative execution in Hadoop?
What are the modules that constitute the Apache Hadoop 2.0 framework?
Have you ever used Counters in Hadoop? Give us an example scenario.
What are sink processors?
What does the mapred.job.tracker property do?
How are blocks distributed among all data nodes for a particular chunk of data?
How can we change the replication factor when data is on the fly?
Explain a simple Map/Reduce problem.
How does NameNode tackle DataNode failures?
What is HBase in Hadoop?