What are the driver and the executor in Spark?
What is Spark in Python?
What is meant by in-memory processing in Spark?
What is the importance of the split-by clause in running parallel import tasks in Sqoop?
What is a bag in Apache Pig?
Can you explain the cluster manager of Apache Spark?
What are the different table structures available in Hive?
What does the following query do? INSERT OVERWRITE TABLE employees PARTITION (country, state) SELECT ..., se.cnty, se.st FROM staged_employees se;
How can you debug Hadoop code?
Does Flume provide 100% reliability to the data flow?
Explain the flatMap operation on an Apache Spark RDD.
Explain Apache Ambari.
What is the importance of dfs.namenode.name.dir in HDFS?
How do you fetch particular columns in Pig?
Is it possible to create a Cartesian join between two tables using Hive?