Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the main features and characteristics of Hadoop that make it the most popular and powerful Big Data tool?
What is Apache Spark and what is it used for?
How does Pig differ from MapReduce?
How much memory is required?
Why does a Mapper run in a heavyweight process and not in a thread in MapReduce?
In how many ways can RDDs be created? Explain.
Is it possible to use Apache Spark for accessing and analyzing data stored in Cassandra databases?
Is Spark SQL a database?
Can we use the Ambari Python Client to access the Ambari APIs?
Can we do real-time processing using Spark SQL?
Why is there a need for broadcast variables when working with Apache Spark?
What is the role of Export in Hadoop Sqoop?
How is a message consumed by a consumer in Kafka?
What is the maximum size of string data type supported by Hive?
What is the role of ZooKeeper in HBase architecture?