Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Define TTL in HBase.
State some advantages of Impala.
How can you set an arbitrary number of Reducers to be created for a job in Hadoop?
Why is Cassandra popular? Clarify.
What is the master node in Spark?
Which types of data can HBase store?
Which distributed machine learning framework runs on top of Spark?
How do you set the number of reducers?
Can you describe the process of creating an Ambari client?
What is the problem with small files in Apache Hadoop?
Is an RDD type-safe?
How is machine learning implemented in Spark?
What is sc.textFile?
What is the purpose of the RawComparator interface?
How do you stop a running job gracefully?