Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Does Impala performance improve as it is deployed to more hosts in a cluster, in much the same way that Hadoop performance does?
What is an RDD in Apache Spark? How are RDDs computed in Spark? What are the various ways in which an RDD can be created?
What role does a worker node play in an Apache Spark cluster? And why does a worker node need to be registered with the driver program?
How does Kafka communicate with clients and servers?
What are partitions in Hive?
List the steps by which Cassandra writes changed data to the commit log.
What is TextInputFormat?
How can you check all the tables present in a single database using Sqoop?
How can you connect an application
What are producer-consumer queues?
What are the limitations of Pig?
What is meant by a columnar storage format?
Why is it named ‘Hadoop’?
What is the MapReduce algorithm?
What do the shell commands “capture” and “consistency” determine?
What is the role of ZooKeeper in Kafka?
What is commodity hardware? Does commodity hardware include RAM?
What do you know about speculative execution?