What is the functionality of ObjectInspector?
What are reliable and unreliable receivers in Spark?
How would you tackle counting words in several text documents?
What is Cassandra cqlsh?
What are the exact differences between the reduce and fold operations in Spark?
Explain the execution plan of a Pig script, or differentiate between the logical and physical plans of an Apache Pig script.
Can you explain broadcast variables?
How do I set up a Flume agent?
Can you explain TextInputFormat?
What is a sink in Flume?
What is MLlib?
If reducers do not start before all mappers finish, why does the progress of a MapReduce job show something like map(50%) reduce(10%)? Why is a reducer progress percentage displayed when the mappers have not finished yet?
How does Cassandra provide high availability?
What is the difference between ORDER BY and SORT BY in Hive?
Explain the difference between the COUNT_STAR and COUNT functions in Apache Pig.