State the usage of the 'filter', 'group', 'order by', and 'distinct' keywords in Pig scripts.
What is Cloudera and why is it used?
What is the significance of using the --compression-codec parameter?
What is the purpose of the DataNode block scanner?
What do you understand by MapReduce?
How much space will the split occupy in MapReduce?
What is Hadoop Sqoop?
How many Mappers run for a MapReduce job?
Describe HDFS Federation.
Why is Flume used?
Which command do we use to insert data in HCatalog?
Why do we need MapReduce during Pig programming?
How does HBase use ZooKeeper?
Does Impala use caching?
Can you explain how it is different from doing machine learning in R or SAS?