Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How do you load data in Pig?
What are the consistency levels for read operations in Cassandra?
What is JobTracker in Hadoop? What are the actions followed by Hadoop?
What is NameNode in Hadoop?
What is the use of ZooKeeper?
How do ‘map’ and ‘reduce’ work?
Why is Apache Spark faster than Apache Hadoop?
What are the basic steps for writing a UDF in Pig?
Is Impala intended to handle real-time queries in low-latency applications, or is it for ad hoc queries for data exploration?
Explain what the NameNode is in Hadoop.
How will you list all the columns of a table using Apache Sqoop?
Can multiple clients write to an HDFS file concurrently in Hadoop?
What is DStream in Apache Spark Streaming?
How does Spark handle monitoring and logging in Standalone mode?
What is Streaming / Log Data?