What alternative way does HDFS provide to recover data if a NameNode without a backup fails and cannot be recovered?
If I create a folder in HDFS, will there be metadata created corresponding to the folder? If yes, what will be the size of metadata created for a directory?
What is a primary key, and what are its different types?
What is a Memtable?
What is the difference between HBase and a relational database?
What is interactive mode in Apache Pig?
Explain JobConf in MapReduce.
What is a MapReduce job in Hadoop?
What do sorting and shuffling do?
Explain the term 'Topic Replication Factor'.
Why does a Mapper run in a heavyweight process and not in a thread in MapReduce?
What is the communication channel between the client and the NameNode/DataNode?
What is the difference between Hadoop, a relational database, and NoSQL?
What does hadoop-env.sh do?
Why Flume?
Explain the functionalities of Ganglia in Ambari.
What is the difference between Hive and Spark?