Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is OutputCommitter?
What are the different types of data models?
What happens if the preferred replica is not in the ISR?
How do you control access to data in Impala?
What is the difference between map and flatMap?
What is ZooKeeper in Kafka? Can we use Kafka without ZooKeeper?
What is a pair RDD in Spark?
What is the difference between HDFS block and input split?
How does Cassandra write data?
On what basis does the NameNode decide which DataNode to write to?
What do you understand by node redundancy, and does it exist in a Hadoop cluster?
What is the traditional method of message transfer?
Can Apache Kafka be used without ZooKeeper?
When to use Cassandra?
How can you control the number of mappers used by the Sqoop command?