Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is Fault Tolerance in Hadoop HDFS?
What problem does Apache Flume solve?
How would you diagnose errors or handle exceptions in Pig?
Explain the reduceByKey() operation in Spark.
How will you backup an HBase cluster?
How are HDFS blocks replicated?
What is a ZooKeeper ensemble?
What is the maximum size of a message that Kafka can receive?
What do you understand by the Parquet file format?
What is the difference between Apache Mahout and Apache Spark’s MLlib?
Explain tokenize?
Can you define SerDe in Hive?
What are the side data distribution techniques?
What is the difference between Hive and Impala?
What is the default replication factor in Hadoop and how will you change it?