Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the difference between an external table and an internal table in Hive?
What is the difference between an InputSplit and an HDFS block?
Can you explain what Cassandra is?
Is cache() an action in Spark?
What is meant by a worker node?
Explain the need for MapReduce while programming in Apache Pig.
Why do we need Spark?
What is a row in Cassandra, and what are its different elements?
List the differences between textFile and wholeTextFiles in Apache Spark.
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What is a generic UDF in Hive?
Which Hadoop versions does the latest version of Hive support?
What are the major features/characteristics of RDDs (Resilient Distributed Datasets)?
In how many ways can RDDs be created? Explain.
How did you debug your Hadoop code?