What is the difference between textFile and wholeTextFiles in Apache Spark?
What do the master class and the output class do?
What is a Hive database?
By default, how many partitions are created for an RDD in Apache Spark?
Can multiple users share the same metastore when Hive runs in embedded mode?
What features do Pig and Hive have in common?
What is logistic regression?
What is Hadoop Pig?
What is Fault Tolerance in Hadoop HDFS?
Can you explain catalog configuration?
Does Pig differ from MapReduce? If yes, how?
What load do concurrent queries produce on the namenode?
Can you define Sqoop in Hadoop?
What is Project Tungsten in Spark?
How does Hive organize data?