What is the Presto Verifier?
How do you set up Apache Spark?
What are the debugging tools used for Apache Pig scripts?
What is BagToTuple in Apache Pig?
What are the features of Standalone (local) mode?
What is an input reader in MapReduce?
What is the maximum size of a message that Kafka can receive?
When using fields grouping in Storm, is there any timeout or limit on known field values?
What is a data centre?
What tools are available to send streaming data to HDFS?
How does Cassandra write changed data to the commit log?
Which features of Apache Spark make it superior to Hadoop MapReduce?
Is Spark a language?
Does Spark work with Python 3?
What do you know about SequenceFileInputFormat?