Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Is Hive useful when making data warehouse applications?
What are the active and passive NameNodes in Hadoop?
Describe what happens to a MapReduce job from submission to output.
What are reliable and unreliable receivers in Spark?
What exactly does Kafka do?
What is Ganglia used for in Ambari?
Why is reading done in parallel in HDFS while writing is not?
Why do we need compression, and which compression formats are supported?
What is Hadoop MapReduce?
What is a node?
Explain the Catalyst query optimizer in Apache Spark.
Explain Thrift and Protocol Buffers vs. Avro.
What is the Apache Spark machine learning library?
Do I need Scala for Spark?
What are the benefits of distributed applications?