What are the benefits of Spark over MapReduce?
Can you explain the Sqoop metastore?
What is a ZooKeeper cluster?
How do you split a single HDFS block into multiple RDD partitions?
What are the methods for creating an RDD in Spark?
How does Cassandra provide high availability?
How can you prevent a large job from running for a long time?
Which do you think is more popular among developers: Pig or Hive?
What does COGROUP do in Pig?
HDFS stores data on commodity hardware, which has a higher chance of failure. So how does HDFS ensure the fault tolerance of the system?
What are the core APIs of Kafka?
What are the data storage units in Cassandra?
What is the replication factor?
Can Spark work without Hadoop?
Explain the use of the file system API in Apache Spark.
Explain multi-tenancy.