Why do we use HDFS for applications with large data sets but not when there are a lot of small files?
What is a DataNode?
What does the mapred.job.tracker property do?
What is InputFormat?
What is HDFS (Hadoop Distributed File System)?
What do you know about KeyValueTextInputFormat?
Why is Hadoop faster?
What is rack awareness in the NameNode?
What are the different methods to run Spark over Apache Hadoop?
Can we call VMs pseudos?
How do you enable the recycle bin (trash) in Hadoop?
Can Hive run without Hadoop?
What is Oozie in Hadoop?