Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS Hadoop Distributed File System (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is sc.parallelize?
What are the data types of Pig Latin?
What is the Spark client?
How can you use the consumer API?
Is there any point in learning MapReduce, then?
What load do concurrent queries produce on the NameNode?
What is the difference between Hadoop and RDBMS?
What is Clustering in Hive?
Explain Spark SQL caching and uncaching?
What does the rack awareness algorithm mean, and why is it used in Hadoop?
How is HDFS fault tolerant?
Describe SPM?
What is HDFS, and what are its components?
What is the usage of the 'filter', 'group', 'orderBy', and 'distinct' keywords in Pig scripts?
What is LazyOutputFormat in Hadoop?