Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Can we say that COGROUP is a group of more than one data set?
Which classes does Hive use to read and write HDFS files?
What is a UDF in Pig?
Explain the properties of an RDD.
Explain the various table design approaches in HBase.
If you run Hive as a server, what mechanisms are available for connecting to it from an application?
Why do we need Impala in Hadoop?
What is the difference between Pig Latin and HiveQL?
What is the JobTracker in Hadoop? What actions does Hadoop follow?
What are the basic parameters of a mapper?
Does Pig give any warning when there is a type mismatch or missing field?
Can you briefly explain Apache Mahout?
What is Hadoop?
What is SerDe in Apache Hive?
How do you set the number of mappers to be created in MapReduce?