Hadoop General (407)
Why do we use HDFS for applications with large data sets, but not when there are a lot of small files?
What is a Databricks cluster?
What are the main features of Impala?
What happens if the number of reducers is 0 in MapReduce?
What are some of the most notable applications of Kafka?
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What are the differences between the caching and persistence methods in Apache Spark?
Can we change settings within a Hive session? If yes, how?
What are the features of a Spark RDD?
Can we change a file cached by the distributed cache?
What are the information segments utilized by Hadoop?
What is the Replication Tool, and what are its types?
Explain the data flow in Flume.
How do you specify the table creator name when creating a table in Hive?
List the functions of Spark SQL.
How is indexing done in Hadoop HDFS?