Hadoop General (407)
Where is the hadoop-env.sh file located?
What are Spark DataFrames?
Can you define Sqoop in Hadoop?
If we want to reuse certain data across several different transformations, what should we do to improve performance?
How does HDFS provide good throughput?
What is the difference between Hadoop and RDBMS?
What is the purpose of JConsole?
Can you explain Apache Spark?
Explain how Apache Spark can be used alongside Hadoop.
If reducers do not start before all mappers finish, why does the progress of a MapReduce job show something like map(50%) reduce(10%)? Why is a reducer progress percentage displayed when the mappers have not finished yet?
Explain how indexing is done in HDFS.
What is fluming?
What are the problems with small files in HDFS?
What are views in Hive?
What is the use of flatMap in Spark?