Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What happens if one Hadoop client renames a file, or a directory containing that file, while another client is still writing to it?
Can you explain accumulators in Apache Spark?
Can you explain what a worker node is?
What is HDFS in Spark?
What is a Spark RDD?
If MapReduce is inferior to Spark, is there any benefit in learning it?
What is Cloudera and why is it used?
Define data integrity. How does HDFS ensure the integrity of the data blocks it stores?
Explain data versioning.
How do I download Apache Mahout?
What mechanism does the Hadoop framework provide to synchronize changes made to the distributed cache while the application is running?
How should you handle session_expired?
What are the components of Spark?
What is the default input type/format in MapReduce?
Explain Tajo worker configuration.