Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How does the JobTracker schedule a job for the TaskTracker?
How can we create a child znode (sub-znode)?
Which languages does Apache Spark support, and which is the most popular?
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What happens if you alter the block size of a column family on an already occupied database?
Explain the use of Hive.
What are sink processors?
List the collection data types in Cassandra.
Is a reduce-only job possible in Hadoop MapReduce?
What is a lineage graph in Spark?
On what basis can you differentiate RDD, DataFrame, and Dataset?
Is it possible to add or delete column families in a working group?
What are the key benefits of using Storm for real-time processing?
What is ISR?
What do you know about speculative execution?