Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How is 0xdata's H2O different from Apache Mahout?
State the difference between the persist() and cache() functions.
What is a Mapper? How can we compress Mapper output in Hadoop?
Should we use RAID in Hadoop or not?
Explain the use of the TaskTracker in a Hadoop cluster.
How many types of NoSQL databases are there?
Explain the tomap function.
What are the data components used by Hadoop?
How can we see only the top 15 records from student.txt, out of 100 records, in the HDFS directory?
What are the components of Hive architecture?
How is RDD in Apache Spark different from Distributed Storage Management?
What are the basic commands available in Hadoop Sqoop?
What is NG in Flume?
What is configured in /etc/hosts, and what is its role in setting up a Hadoop cluster?
In which kinds of scenarios are MapReduce jobs more useful than Pig in Hadoop?