Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is Sqoop import? Explain its purpose.
What are the main features of Impala?
Who is a 'user' in HDFS?
What is the difference between Flume and Kafka?
What are all stats classes in the org.apache.pig.tools.pigstats package?
Is it possible to run Spark and Mesos along with Hadoop?
What is a TaskTracker in Hadoop?
What are the data components used by Hadoop?
What happens if you do not issue the command ‘set hive.enforce.bucketing=true;’ before bucketing a table in Apache Hive 0.x or 1.x?
How is HDFS fault tolerant?
What is rack awareness?
What are the use cases of Apache Pig?
What is a rack awareness algorithm and why is it used in Hadoop?
How is data or a file written into HDFS?
What are Spark jobs?