Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Who is the intended audience for learning HCatalog?
Is Impala intended to handle real-time queries in low-latency applications, or ad hoc queries for data exploration?
Which HBase filter accepts the page size as a parameter?
If the number of DataNodes increases, do we need to upgrade the NameNode?
When should you use explode in Hive?
What is an HFile?
Why is replication required in Kafka?
How does a Bloom filter help in searching rows?
What is Spark ETL?
What is the difference between Apache Mahout and prediction.io?
What happens if the number of reducers is 0 in MapReduce?
Why should we use the 'group' keyword in Pig scripts?
What are accumulators in Spark?
What are the different operational commands in HBase at record level and table level?
What are the differences between Hive and Spark SQL?