Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the difference between an RDD and a DataFrame in Spark?
What is the sequence of execution of map, reduce, RecordReader, split, combiner, and partitioner?
Describe the distinct(), union(), intersection(), and subtract() transformations in Apache Spark RDD.
How can you ensure the logical grouping of cells in HBase?
Explain what the JobTracker is in Hadoop. What are the actions followed by Hadoop?
What is RDD Lineage?
Can there be no Reducer?
How do you change from su to cloudera?
Why does my select statement fail?
What is the maximum size of a message that a Kafka server can receive?
What do sorting and shuffling do?
Explain the Tajo worker configuration.
When are the reducers started in a MapReduce job?
Do we need Scala for Spark?
What does a split do?