Why do we use HDFS for applications with large data sets, but not when there are lots of small files?
What is the command to format the NameNode?
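For reference, a sketch of the standard command (destructive: it reinitializes the NameNode's metadata, so it is normally run only when setting up a new cluster):

```
# Format the NameNode's metadata directory (new clusters only)
hdfs namenode -format

# Older, now-deprecated form
hadoop namenode -format
```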
What are Apache Knox and the Hadoop Development Tools?
What is NoSQL?
Explain the basic difference between a traditional RDBMS and Hadoop.
What are the modules that constitute the Apache Hadoop 2.0 framework?
Define a daemon.
What is crontab? Explain with a suitable example.
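A hedged example of a crontab entry: the five fields are minute, hour, day of month, month, and day of week; this one runs a cleanup script (the path is hypothetical) every day at 2 a.m.:

```
# m  h  dom mon dow  command
  0  2  *   *   *    /home/hadoop/scripts/cleanup.sh
```

Entries are edited with `crontab -e` and listed with `crontab -l`.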
How do you keep an HDFS cluster balanced?
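One common approach is the HDFS balancer, which moves blocks between DataNodes until disk usage is roughly even. A sketch, assuming a standard Hadoop install:

```
# Rebalance until each DataNode's utilization is within 5 percentage
# points of the cluster-wide average
hdfs balancer -threshold 5
```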
Explain a simple Map/Reduce problem.
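The classic example is word count. As a rough sketch, its three phases (map, shuffle/sort, reduce) can be mimicked with a Unix pipeline, which is also how Hadoop Streaming jobs are often prototyped:

```shell
# map:    tr emits one word per line
# shuffle: sort groups identical keys together
# reduce:  uniq -c counts each group
printf 'apple banana apple\nbanana apple\n' | tr -s ' ' '\n' | sort | uniq -c
```

This prints each distinct word with its count (3 for apple, 2 for banana).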
Shouldn't DFS already be able to handle large volumes of data?
What are a block and a block scanner in HDFS?