How does Pig work?
Why should we use the 'group' keyword in Pig scripts?
What is the difference between a Dataset and a DataFrame in Spark?
How can you set an arbitrary number of Reducers to be created for a job in Hadoop?
What is the stable version of Hive?
What is the required action you need to perform if you opt for scheduled maintenance on the cluster nodes?
How can we create znodes?
Is it possible to add a parameter while running a saved job?
Explain the process of spilling in MapReduce.
Explain some important features of Hadoop.
What is a JMX connector?
Where is an RDD stored?
What is a Spark context?
Do we need to place the 2nd and 3rd replicas of the data in rack 2 only?
How are Avro schemas created?