Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the various running modes of Apache Spark?
What will be the output of CAST('XYZ' AS INT)?
What is meant by a DataNode?
Explain the partitioning, shuffle, and sort phases in MapReduce.
What are the characteristics of the Hadoop framework?
How often does a DataNode send a heartbeat to the NameNode in Hadoop?
Why do we need buckets?
What is the Application Master in Spark?
What is the starvation scenario in Spark Streaming?
What does the map transformation do? Provide an example.
What is SerDe in Apache Hive?
How do you change the replication factor of files already stored in HDFS?
When should you use Avro? Explain.
Is it possible to perform real-time analysis directly on the big data collected by Flume? If yes, explain how.
What are Flume and Kafka?