Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the spooldir source in Flume?
What is executor memory in Spark?
What is the problem with having lots of small files in HDFS?
How much faster is Apache Spark than Hadoop?
What is the function of "MLlib"?
How is data partitioned before it is sent to the reducer if no custom partitioner is defined in Hadoop?
Is fs.mapr.working.dir a single directory?
What is the use of CQL collections in Cassandra?
Explain HCatInputFormat.
In MapReduce, ideally how many mappers should be configured on a slave?
List the popular use cases of Apache Spark.
Can we set a different replication factor for existing files in HDFS?
How do you troubleshoot Impala?
Can you explain BloomMapFile?
What are any two limitations of Flume?