What is ZooKeeper's role in Kafka? Can we use Kafka without ZooKeeper?
What are the primitive data types in Pig?
Replication causes data redundancy, so why is it pursued in HDFS?
What is a cluster in Apache Spark?
How to submit extra files (JARs, static files) for a MapReduce job at runtime in Hadoop?
Is Apache Spark a framework?
What is a “Distributed Cache” in Apache Hadoop?
By default, how many partitions are created in an RDD in Apache Spark?
What are the various levels of persistence in Apache Spark?
What is the Catalyst framework in Spark?
What is the sequence of execution of map, reduce, RecordReader, split, combiner, and partitioner?
Can you explain the Sqoop metastore?
What kind of music is Flume?
What is a UDF?
How to handle record boundaries in text files or sequence files in MapReduce InputSplits?