Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Can you explain what a sequence file is in Hadoop?
Differentiate between the terms node, cluster, and data center in Cassandra.
What is a Dataset? What are its advantages over DataFrame and RDD?
What is the current version of Hive?
Can region servers be located on all DataNodes?
Where is the mapper's intermediate data stored?
Mention the date data type in Hive. Name the Hive collection data types.
How is RDD in Apache Spark different from Distributed Storage Management?
What are bucketing and clustering in Hive?
Name the different types of primary keys in Cassandra.
Does Spark provide the storage layer too?
How do you get a single file as the output of a MapReduce job?
Does Flume provide 100% reliability to the data flow?
Give an example of a document database.
How do you run Pig scripts on a Kerberos-secured cluster?