Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the difference between a Hadoop database and a relational database?
What is the difference between Flume and Kafka?
What are the key benefits of using Storm for real-time processing?
Explain transformations and actions in the context of RDDs.
What is the function of the JobTracker in Hadoop?
How many InputSplits does the Hadoop framework create?
Is it possible to provide multiple inputs to Hadoop? If yes, explain.
Why is Spark popular?
Does Apache Flume also support third-party plugins?
Define data integrity.
What is the use of the export command in Sqoop?
Where can I get sample data to try?
What is the role of the Spark driver, and where on the cluster does it run?
What is the Hadoop MapReduce API contract for a key and value class?
Is it mandatory to set the input and output type/format in MapReduce?