What is HCatalog used for?
Do you need to install Spark on all nodes of a YARN cluster?
What is the default input format in MapReduce?
What are some use cases of Apache Kafka?
Which ports does Cassandra use?
How many filters are available in HBase?
What is a Memtable in Cassandra?
Which Ambari components can be used for automation as well as integration?
What is a shuffle block in Spark?
How do you troubleshoot Impala?
What is the difference between an RDD and a DataFrame?
What is the difference between SORT BY and ORDER BY in Hive?
Explain task granularity.
What are the uses and applications of Mahout?
Which operating system(s) are supported for production Hadoop deployment?