What are the popular use cases of Apache Spark?
Is there any difference between FileSink and FileRollSink?
What is a graph database? Explain with an example.
What is a cluster in Apache Spark?
What are the execution modes in Apache Pig?
What is Kafka?
Which relational operators in Pig Latin are used for grouping and joining?
What happens when the data set exceeds available memory?
Explain the first() operation on an Apache Spark RDD.
What is the use of the "cqlsh --version" command?
What is the use of explode in Hive?
What is the role of ZooKeeper?
Which relational operators in Pig Latin are used for combining and splitting?
What are the limitations of importing RDBMS tables into HCatalog directly?
How do you write a custom partitioner for a Hadoop job?