Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What alternative way does HDFS provide to recover data if a NameNode without a backup fails and cannot be recovered?
How does an RDD work in Spark?
What is a Flume agent?
What do you understand by data replication in Cassandra?
Is it possible to use Kafka without ZooKeeper?
List commonly used machine learning algorithms.
What does the hadoop-metrics.properties file do?
Why do we perform partitioning in Hive?
Explain the TextLoader function.
What is the small file problem in Hadoop? How can it be resolved?
In how many ways can you create an RDD in Spark?
What is a RegionServer?
Explain what the WAL and HLog are in HBase.
Which bit version does Ambari need, and which operating systems are compatible with it?
What is Spark used for?