Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Hadoop achieves parallelism by dividing tasks across many nodes, so a few slow nodes can rate-limit the rest of the program. What mechanism does Hadoop provide to combat this?
Is it possible to provide multiple inputs to Hadoop? If so, how can you give multiple directories as input to a Hadoop job?
How can the metastore be shared among multiple users?
What does the LOWER or LCASE function do in Hive? Give an example.
Why does Hive not store metadata in HDFS?
What is the function of the MapReduce partitioner?
What should be considered in hardware planning for the master node in a Hadoop architecture?
Can I install Spark on Windows?
Name the types of tunable consistency.
Explain how HCatalog enables the right tool for the right job.
What are barriers?
Explain the uses of MapReduce in Pig.
What is the DataFrame API?
Explain what a memtable is in Cassandra.
What is the biggest shortcoming of Spark?
What is the primary purpose of Pig in the Hadoop architecture?
What happens if there is an error in Impala?