Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)

How does Spark work with Python?
What are the different primitive data types available in Hive?
What happens when the node running the map task fails before the map output has been sent to the reducer?
What is the difference between Apache Hadoop and an RDBMS?
When to avoid secondary indexes?
If I create a folder in HDFS, will there be metadata created corresponding to the folder? If yes, what will be the size of metadata created for a directory?
What is Ambari shell?
What is a NameNode? How many instances of the NameNode run on a Hadoop cluster?
How can Hive improve performance with ORC format tables?
What is a DataFrame in Spark?
How does Hadoop achieve fault tolerance?
What are the key benefits of using Storm for real-time processing?
Does this lead to security issues?
What is the difference between Primary, Partition and Cassandra?
What are the particular functionalities of Nagios in Ambari?