Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Does Impala performance improve as it is deployed to more hosts in a cluster, in much the same way that Hadoop performance does?
What is an RDD in Apache Spark? How are RDDs computed in Spark? What are the various ways in which an RDD can be created?
What role does a worker node play in an Apache Spark cluster? And why does a worker node need to be registered with the driver program?
Can multiple clients write into a Hadoop HDFS file concurrently?
Explain the use of the TaskTracker in a Hadoop cluster.
Explain briefly what an action is in Apache Spark. How is the final result generated using an action?
During installation, why does Apache have three config files: srm.conf, access.conf and httpd.conf?
What is Apache ZooKeeper meant for?
How does the JobTracker schedule a job for the TaskTracker?
Does NoSQL follow the relational database model?
What is the Apache Spark engine?
List the network requirements for using Hadoop.
What's the best way to copy files between HDFS clusters?
What is client mode in Spark?
What is data skew in Spark?
What are the tools used in big data?
What does the "USE" command in Hive do?
What are the Writable and WritableComparable interfaces?