Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain the term paired RDD in Apache Spark.
What file formats does Hive support for storage?
Explain transformations and actions in the context of RDDs.
What do sorting and shuffling do?
Where can I find Impala documentation?
How can you send messages in Kafka?
What size is recommended for each node?
Briefly explain the architecture of Spark.
How is data or a file written into HDFS?
What is an atom in Pig?
Explain the usage of the Context object.
Explain SparkContext in Apache Spark.
Explain the ZooKeeper workflow.
What is the difference between SQL and NoSQL?
What are column families? What happens if you alter the block size of a column family on an already populated database?