What is a column-store database? Explain with an example.
How does a client read/write data in HDFS?
Why does HDFS replicate data, even though replication results in data redundancy in Hadoop?
Explain the HCatalog Create Table CLI command along with its syntax.
Why do we use parallelize in Spark?
How much space will the split occupy in MapReduce?
Explain the Spark saveAsTextFile() operation.
Why do we need Impala in Hadoop?
What is the default input type in MapReduce?
What are the different elements of JConsole?
What are the barriers?
Explain the Reducer's reduce phase.
What are the prerequisites for contributing to Apache Mahout?
What is the difference between Mahout and GraphLab?
What is Spark in Python?