Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is Hadoop HDFS – Hadoop Distributed File System?
How can you implement machine learning in Spark?
What is ZooKeeper?
How do you optimize a Hadoop MapReduce job?
Explain the process for starting a Kafka server.
Explain the key features of Spark.
What are Scala and Spark?
Which tools are used in Ambari monitoring?
Are there any problems that can be solved only by MapReduce and not by Pig? In which scenarios are MapReduce jobs more useful than Pig?
Why not just use ZooKeeper for everything?
How do you run a file system check in HDFS?
Explain the constituents of the Apache ZooKeeper architecture.
What is the Apache Spark architecture?
What is a combiner, and where should you use it?
In what ways is SparkSession different from SparkContext?