Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How can we change the split size if our commodity hardware has less storage space?
How can we import data from a particular row or column? What destination types are allowed in the Sqoop import command?
What relational operators are available for loading and storing data in Pig Latin?
What is setMaster in Spark?
MapReduce jobs are failing on a cluster that was just restarted. They worked before the restart. What could be wrong?
What are the data components used by Hadoop?
What are the identity mapper and identity reducer? In which cases can we use them?
What is a broadcast variable?
What is the role of ALTER KEYSPACE?
Can we run Unix shell commands from Hive? Can Hive queries be executed from script files? How? Give an example.
What happens if the number of reducers is 0 in MapReduce?
Can you define an RDD?
Why are nodes removed and added frequently in a Hadoop cluster?
Explain bucketing in Hive.
What is the difference between client and cluster mode in Spark?