What is the function of the MapReduce partitioner?
What are the benefits of Apache Kafka over traditional techniques?
How can native libraries be included in YARN jobs?
Explain the foreach() operation in Apache Spark.
When should you use coalesce and repartition in Spark?
What is the difference between map and flatMap?
What is a Sqoop metastore?
What is vectorized query execution?
Define actions in Spark.
What is the role of the Spark driver in Spark applications?
What are the most common OutputFormats in Hadoop?
What is a Task instance in Hadoop? Where does it run?
What additional benefits does YARN bring to Hadoop?
What are the various configuration parameters required to run a MapReduce job?
Explain partitions.