What are the advantages of Spark?
What are views in Hive?
Explain the write-ahead log (journaling) in Spark.
Can I do INSERT … SELECT * into a partitioned table?
Name three data sources available in Spark SQL.
Which of the two is preferable for a project: Hadoop MapReduce or Apache Spark?
Name a few other popular column-oriented databases like HBase.
Hadoop uses replication to achieve fault tolerance. How is this achieved in Apache Spark?
Illustrate a simple example of the working of MapReduce.
What are the features of Kafka Streams?
Can multiple clients write to an HDFS file simultaneously?
What role does ZooKeeper play in a Kafka cluster?
Did you maintain the Hadoop cluster in-house or use Hadoop in the cloud?
What is the difference between SORT BY and ORDER BY in Hive?
What is the purpose of explode in Hive?