Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are some key points about Hive for Hadoop?
What are shared variables in Apache Spark?
What is configured in /etc/hosts, and what is its role in setting up a Hadoop cluster?
What is winutils in Hadoop?
How are joins performed in Impala?
What are shared variables?
What is a row in Cassandra?
What is the role of JDBC driver in Sqoop?
What are 'slaves' and 'masters' in Hadoop?
What is Hive made of?
Are Spark DataFrames immutable?
What is the difference between an input split and an HDFS block?
What are the main configuration parameters that a user needs to specify to run a MapReduce job?
What is the difference between coalesce and repartition in Spark?
What is the default replication factor in Hadoop and how will you change it?