Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Why does HDFS store data on commodity hardware despite the higher chance of failures?
What is a NoSQL database?
What is the difference between NAS and HDFS?
How does Spark use Hadoop?
When is it not recommended to use the MapReduce paradigm for large
I have a row or key cache hit rate of 0.XX123456789 reported by JMX. Is that XX% or 0.XX%?
What is the use of Spark SQL?
How do I achieve FIFO behavior with Kafka?
What is an Agent?
How does an RDD persist data?
What is data cleansing?
What are combiners, and when should you use a combiner in a MapReduce job?
Do I need Scala for Spark?
What are the differences between the caching and persistence methods in Apache Spark?
What is a starvation scenario in Spark Streaming?