Why do we use HDFS for applications that have large data sets, and not when there are a lot of small files?
What is an offset?
Explain deletion in HBase.
How does Cassandra store data?
How does Cassandra provide high availability?
Is Scala required for Spark?
What do you know about speculative execution?
Explain the differences between a combiner and a reducer.
How is data or a file read in HDFS?
What is InputFormat in Hadoop MapReduce?
How do you submit extra files (jars, static files) for a Hadoop MapReduce job at runtime?
Does HDFS go wrong? If so, how?
In MapReduce, what is a scarce system resource? Explain.
Define the term Thrift.
Which method is used to access HFile directly without using HBase?
How can you compare Hadoop and Spark in terms of ease of use?