What is a heartbeat in HDFS?
What is the use of the SOURCE command in Cassandra?
What is a SerDe in Apache Hive?
Explain the Spark join() operation.
Explain metadata in the NameNode.
How is Flume-NG different from Flume 0.9?
When loading data into a Hive table using the LOAD DATA clause, how do you specify that it is an HDFS file and not a local file?
How is data or a file written into HDFS?
Is it possible to search for files using wildcards?
What are the ways to launch Apache Spark over YARN?
What is a key-value pair in MapReduce?
What are the common components of the Spark ecosystem?
What do you understand by snitches?
Can I run an ensemble cluster behind a load balancer?
If the source data is updated periodically, how will you synchronize the data in HDFS that was imported by Sqoop?