Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Define data integrity. How does HDFS ensure the integrity of the data blocks it stores?
What is Tungsten in Spark?
What is sc.textFile?
Explain partitions.
Why do we need HDFS?
Define yum.
What are the benefits of NoSQL over a relational database?
What is Apache Flume?
Explain how HDFS communicates with the native Linux file system.
Explain clustering in Hive.
What is the difference between Spark and Hadoop?
What services run after running an HBase job?
How do you write your own SerDe?
how is a file of the size 1 GB uncompressed
Where does the Spark driver run on YARN?