Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the key-value pair in MapReduce?
What is the difference between Spark and Hadoop?
What is the usefulness of the DISTRIBUTE BY clause in Hive?
What is Kafka in Hadoop?
What are the benefits/advantages of Cassandra?
What is the use of the “resultset execute” method?
What is the purpose of the ‘dump’ keyword in Pig?
What is throughput? How does HDFS provide good throughput?
What is cqlsh? And why is it used?
Which command do we use to run HBase Shell?
What are the three types of tombstone markers in HBase?
Why does HDFS store data on commodity hardware despite the higher chance of failures in Hadoop?
Which operating systems does Cassandra support?
What is a lineage graph?
What are the main methods of transferring data in Hadoop Sqoop?