Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the relationship between Hadoop, HBase, Hive, and Cassandra?
Can Ambari manage multiple clusters? Why or why not?
What are the active and passive NameNodes in HDFS?
Is HDFS utilized in Cassandra? If yes, where?
Why do we need Spark?
What does the map transformation do? Provide an example.
What is a 'block' in HDFS?
What does /*+ STREAMTABLE(table_name) */ do?
What are the log files of the Presto server?
What size is recommended for each node?
Can you explain how to minimize data transfers while working with Spark?
What is PageRank in GraphX?
What is the main difference between HBase and Hive?
How is a keyspace created in Cassandra?
What is HBase HMaster?