If the source data is updated from time to time, how do you synchronize the data in HDFS that was imported by Sqoop?
How can you use Akka with Spark?
What does the Spark Engine do?
What is a row in Cassandra, and what are its different elements?
How do you submit extra files (JARs, static files) for a MapReduce job at runtime in Hadoop?
What does FOREACH do?
What problem does Apache Pig solve?
What is the physical plan in the Pig architecture?
Why is Pig used in Hadoop?
What is the maximum number of rows in a table?
What is a NoSQL database?
What is a heartbeat in HDFS? Explain.
What tasks can you perform to manage hosts using the Ambari Hosts tab?
How does Cassandra write changed data to the commit log?
What do you mean by column family?