What is Apache HCatalog?
What is the abstraction of Spark Streaming?
What is a rack awareness algorithm and why is it used in Hadoop?
What are the three modes in which Hadoop can be run?
What is SimpleStrategy?
Explain the level of parallelism in Spark Streaming.
How is machine learning implemented in Spark?
What are the most common input formats defined in Hadoop?
What is a MapReduce algorithm?
Can we change the body of a Flume event?
What is the use of the exists command?
What is the use of the SOURCE command in Cassandra?
What is the importance of the driver in Hive?
What do you mean by TaskInstance?
What kinds of Impala queries or data are best suited for HBase?