Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
State the usage of the 'filter', 'group', 'order by', and 'distinct' keywords in Pig scripts.
What happens if one Hadoop client renames a file, or a directory containing that file, while another client is still writing to it?
What is meant by metadata in HDFS? Where is it stored in Hadoop?
On what basis does the NameNode decide which DataNode to write to?
Whenever I run a Hive query from a different directory, it creates a new metastore_db. What is the reason for this?
What is a UDF?
Explain the different transformations on a DStream.
Define Partition and Partitioner in Apache Spark.
What does Apache Mahout do?
What is the role of the coordinator node in a read?
What are the disadvantages of using Spark?
What is the difference between Spark and Kafka?
Why is Flume used?
Explain the constituents of the Apache ZooKeeper architecture.
Explain the memtable in Cassandra.