What is the Grunt shell?
How do you organize Pig Latin statements?
What are file permissions in HDFS, and how does HDFS check permissions for files or directories?
What are broadcast variables in Apache Spark? Why do we need them?
Illustrate a simple example of the working of MapReduce.
Explain the various table design approaches in HBase.
How can you stop a partition from being queried?
What is "Non DFS Used"?
What are the components of the Apache Pig platform?
What do you mean by consistency in Cassandra?
Have you ever run into a lopsided job that resulted in an out-of-memory error? If so, how did you handle it?
What is the default replication factor in Hadoop and how will you change it?
How do you change the replication factor for the cases below?
Explain the basic parameters of the mapper and reducer functions.
What types of costs are associated with creating an index on Hive tables?