What is lazy evaluation in Spark?
Is fs.mapr.working.dir a single directory?
Can we run Spark on Windows?
What is MLlib?
What happens if the preferred replica is not in the ISR?
How do you set up Spark?
What is a common use case for Flume?
What is throughput? How does HDFS achieve good throughput?
What is a pipelined RDD?
What are the important points about the DOUBLE type in Hive?
Name a few companies that use Apache Spark.
How will you backup an HBase cluster?
What are Scala and Spark?
What are executor memory and driver memory in Spark?
Can you use Spark to access and analyse data stored in Cassandra databases?