Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What does the AdminClient API do in Kafka?
When should you use HBase?
Explain the HDFS architecture.
Why do we need Pig?
How does the gossip protocol work?
What benefits does YARN bring to Hadoop?
How do Pig programs get converted into MapReduce jobs?
Which Ambari components can be used for automation and integration?
Explain how indexing is done in HDFS.
What are the port numbers of the NameNode, JobTracker, and TaskTracker?
How do you write a custom partitioner for a Hadoop MapReduce job?
How can you stop a partition from being queried?
What is a keyspace in Cassandra?
What are combiners, and what is their purpose?
What are the abstractions of Apache Spark?