Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is Spark flatMap?
What are the primitive data types in Pig?
What are the prime features of Apache ZooKeeper?
What is a Bloom filter?
What are the data storage units in Cassandra?
What does FOREACH do?
What is the common workflow of a Spark program?
What storage types are supported by Tajo?
How do you enable the recycle bin in Hadoop?
How does HBase actually delete a row?
How can you get exactly-once messaging from Kafka during data production?
What are the benefits of Apache Kafka over traditional techniques?
What is the difference between Pig Latin and the Pig Engine?
How can you set an arbitrary number of Reducers to be created for a job in Hadoop?
Which relational operators are available for loading and storing data in Pig Latin?