Why do we use HDFS for applications with large data sets, but not when there are a lot of small files?
Explain the shuffle phase.
Is Hadoop based on Google MapReduce?
What is a NameNode?
What is the default block size in HDFS?
What is the difference between int and IntWritable?
What does the ‘jps’ command do?
What are the network requirements for using Hadoop?
What is an InputFormat?
What is a row key?
How can we take Hadoop out of safe mode?
What are the three main hdfs-site.xml properties?
What is rack awareness?