Why do we use HDFS for applications with large data sets, but not when there are a lot of small files?
What is a column family in HBase?
How many InputSplits does the Hadoop framework create?
What is the usage of the "cqlsh --version" command?
Why do we need Hadoop?
What is the RDD map() operation?
What is a 'key-value pair' in HDFS?
How do I achieve FIFO behavior with Kafka?
What is the difference between FileSink and FileRollSink?
What is Apache Spark Streaming? How is streaming data processed in Apache Spark?
Explain the sortByKey() operation.
How does an RDD work in Spark?
Define a partition in Apache Spark.
What main configuration parameters are specified in MapReduce?
What is the difference between a DataFrame and a Dataset in Spark?
Explain the DROP VIEW statement along with its syntax.