How do you enable bucketing in Hive?
What role does a worker node play in an Apache Spark cluster, and why does a worker node need to register with the driver program?
What is the use of the EXPAND command in cqlsh for Cassandra?
Explain the Pig architecture.
How do we create RDDs in Spark?
What is Spark Streaming?
Explain what SequenceFileInputFormat is.
How do ‘map’ and ‘reduce’ work?
Is it necessary to start Hadoop to run an Apache Spark application?
What is Kafka?
What is the Hadoop framework?
How does Spark use Akka?
What do you understand by a Parquet file?
What characteristic of the streaming API makes it flexible enough to run MapReduce jobs in languages like Perl, Ruby, Awk, etc.?
What do you know about speculative execution?