Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the JobTracker in Hadoop? What actions does Hadoop follow?
What are the most common input formats defined in Hadoop?
What can the Ambari Shell provide?
What is shuffle spill in Spark?
Name any two features of Flume.
What types of costs are associated with creating an index on Hive tables?
How does Spark compare with Hadoop?
Is it possible to search for files using wildcards?
How do reducers communicate with each other?
How do you write a custom partitioner for a Hadoop MapReduce job?
What is the spill factor with respect to RAM?
Can you explain the difference between Apache Mahout and Apache Spark’s MLlib?
What are the consistency levels for read operations in Cassandra?
Suppose Hadoop spawned 100 tasks for a job and one of the tasks failed. What will Hadoop do?
Can you explain Apache Ambari?