For a job in Hadoop, is it possible to change the number of mappers to be created?
What is a pair RDD in Spark?
What types of storage does Tajo support?
How can you achieve high availability in Apache Spark?
Does Spark use YARN?
Why do people use Spark?
Can you explain accumulators in Apache Spark?
Can you explain the architecture of Hadoop Pig?
What is the difference between the logical plan and the physical plan of a Pig script?
What is the function of the ApplicationMaster?
What is a YAML file in Cassandra?
Can you define a Parquet file?
What is the difference between client mode and cluster mode in Spark?
What happens to ZooKeeper sessions while the cluster is down?
How does Apache Flume work?