Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Is Hadoop based on Google MapReduce?
Do we need to place the 2nd and 3rd copies of the data in rack 2 only?
Do you need to install Spark on all nodes of a YARN cluster while running Spark on YARN?
Why do we need Hadoop Archives? How are they created?
How is streaming implemented in Spark? Explain with examples.
How do you set up the local repository manually?
What are the important differences between Apache and Hadoop?
What are the various components in Kafka?
When running Spark applications, is it necessary to install Spark on all the nodes of YARN cluster?
Why do we need Spark?
What is an Agent?
Does Pig differ from MapReduce? If yes, how?
Can you explain indexing?
What do you understand by SSTable in Cassandra?
What is an accumulator in Spark?