Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is a tuple in Pig?
Can the balancer be run while Hadoop is in use?
How can Sqoop be used in a Java program?
What are the features of Fully-Distributed mode?
What are partitions in Hive?
Should we use RAID in Hadoop or not?
What is a map side join?
On what concept does the Hadoop framework work?
What is the difference between a MapReduce InputSplit and HDFS block?
What tools are available to send streaming data to HDFS?
What is the replication factor?
In which kinds of scenarios are MapReduce jobs more useful than Pig in Hadoop?
Can Impala run user-defined functions (UDFs)?
What is the biggest shortcoming of Spark?
What are the current Hadoop configuration files?