What is SparkContext in Spark?
How do you run a file system check in HDFS?
What do you understand by partitions in Spark?
What is the difference between ORDER BY and SORT BY in Hive?
Have you ever run into a lopsided job that resulted in an out-of-memory error? If so, how did you handle it?
Why is Spark faster than Hadoop?
What is spark slang for?
Explain the Spark executor.
Explain repositories in Apache Ambari.
How do you use the HDFS put command to transfer data from Flume to HDFS?
What type of data should we put in the distributed cache? When should we put data in it, and how much?
Can Spark be used without Hadoop?
What are the complex data types in Pig?
What happens if you do not issue the command ‘set hive.enforce.bucketing=true;’ before bucketing a table in Apache Hive 0.x or 1.x?
What is Hector?