Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the advantages of DataFrames?
How do I optimize my Spark code?
What is a partitioning key?
What are the advantages of using a map-side join in MapReduce?
How do I use Spark with big data?
List a few benefits of Spark over MapReduce.
What are shuffling and sorting in MapReduce?
In MapReduce, is sorting done on the mapper node or the reducer node?
What is the external shuffle service in Spark?
What are the different types of Java UDFs supported by Apache Pig?
What is Spark checkpointing?
Explain the AvroStorage function.
Is Spark better than MapReduce?
What is data cleansing?
What is a graph database?