How does Hadoop’s CLASSPATH play a vital role in starting or stopping Hadoop daemons?
Can the Ambari Python client be used to make good use of the Ambari APIs?
What is an RDD in Spark? Explain with an example.
Explain the execution plans of a Pig script, or differentiate between the logical and physical plans of an Apache Pig script.
What are broadcast variables in Apache Spark? Why do we need them?
What is the role of the data transfer API in HCatalog?
Why does Hive not store metadata information in HDFS?
In the Producer, when does QueueFullException occur?
What are the main components of a Hadoop Application?
What are the different types of partitioners in Cassandra?
Differentiate between the terms node, cluster, and data center in Cassandra.
List the configuration parameters that have to be specified when running a MapReduce job.
Why do we need MapReduce during Pig programming?
What does it indicate if a replica stays out of the ISR for a long time?
How does one create RDDs in Spark?