Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the advantages of Spark?
Are there any problems that can be solved only by MapReduce and not by Pig? In which scenarios are MapReduce jobs more useful than Pig?
What platform and Java version are required to run Hadoop?
How can you improve the throughput of a remote consumer?
Does Cassandra support ACID transactions?
How are HDFS blocks replicated?
Which bit version does Ambari require, and which operating systems is it compatible with?
What do you understand by a Parquet file?
What does rack awareness mean?
Which storage level does the cache() function use?
What are the various modes in which Spark runs on YARN? (Local vs Client vs Cluster Mode)
Can we broadcast an RDD?
Is a JDBC driver enough to connect Sqoop to databases?
Give some key points about Pig for Hadoop.
What is a single-node cluster in Hadoop? For what purposes is Hadoop run on a single-node cluster?