Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain what Hadoop is.
What are the components of the Apache Ambari architecture?
Can you join multiple fields in Apache
How does Kafka communicate with clients and servers?
What can skew the mean?
What is a DStream?
What is Spark vectorization?
What is the major difference between a local and a remote metastore?
What is JPS? Why is it used in Hadoop?
What is the difference between the MapReduce engine and the HDFS cluster?
What is structured data?
Which is better for Spark: Scala or Python?
While loading data into a Hive table using the LOAD DATA clause, how do you specify that it is an HDFS file and not a local file?
What are the general prerequisites to learn HCatalog?
What are the actions in Spark?