How will you calculate the number of executors required to do real-time processing using Apache Spark? What factors need to be considered when deciding on the number of nodes for real-time processing?
In a given Spark program, how will you identify whether a given operation is a transformation or an action?
Describe what happens to a MapReduce job from submission to output.
What are Knox and the Hadoop Development Tools?
List a few differences between Apache Kafka and RabbitMQ.
What happens if a block in HDFS is corrupted?
What are the different primitive data types available in Hive?
How many Reducers run for a MapReduce job?
What is the maximum number of rows in a table?
Explain the first() operation in Apache Spark.
Have you ever used counters in Hadoop?
What is the rack-aware replica placement policy?
What is a Memtable?
What is the internal architecture of the Cassandra database?
What is the maximum size of a message that a Kafka server can receive?
Does the MapReduce programming model provide a way for reducers to communicate with each other? In other words, can one reducer in a MapReduce job communicate with another reducer?
What does the Streams API do in Kafka?