Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain the join() operation in Apache Spark.
What is the Apache Spark machine learning library?
Write a MapReduce program for character count.
How can we create an RDD in Apache Spark?
What is the benefit of the distributed cache? Why can't we just keep the file in HDFS and have the application read it?
In what ways does Apache Spark handle accumulated metadata?
What problems have you faced when working on Hadoop code?
What are the differences between Hadoop 1 and Hadoop 2?
What is a MapReduce algorithm?
How is NFS different from HDFS?
How would a Hadoop administrator deploy the various components of Hadoop in production?
What is Pig Statistics? Which stats classes are available in the Java API package?
What are good use cases for Impala as opposed to Hive or MapReduce?
What is the role of ZooKeeper?
How is HDFS fault tolerant?