Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Compare Pig, Hive, and Hadoop MapReduce.
How can you add arbitrary key-value pairs in your mapper?
Name the operations supported by an RDD.
On what basis can you differentiate RDD, DataFrame, and Dataset?
How is Spark used in Hadoop?
What are the different operational commands in HBase at the record level and the table level?
What makes Apache Spark good at low-latency workloads like graph processing and machine learning?
How do you write a custom key class?
Which machine learning algorithms does Apache Mahout support?
What is the unit of data that flows through a Flume agent?
Explain caching in Spark Streaming.
What does a Mapper do?
Compare Hadoop and Spark.
What are the main configuration parameters that a user needs to specify to run a MapReduce job?
Why are transformations lazy operations in Apache Spark RDDs? How is this useful?