Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Clarify Memtable?
What is the difference between persist
What is Spark Catalyst?
What is Apache Hive?
What is a primary key, and what are its different types?
What are the components of a Hive query processor?
What are the main features of Spark SQL?
Can you overwrite Hadoop MapReduce configuration in Hive?
What is a data pipeline in Spark?
Explain Apache HBase.
Explain the concept of an RDD (Resilient Distributed Dataset). Also, state how you can create RDDs in Apache Spark.
Why do we need Hadoop Archives? How are they created?
Where are Hadoop’s configuration files located, and what are they?
When optimizing queries in Hive, in what order of table size should the tables appear in a join query?
Explain what a memtable is in Cassandra.