Give me examples of unstructured data?
When using fields grouping in Storm, is there any time-out or limit on known field values?
What is the difference between Python and Spark?
List the functions of Spark SQL.
Which command is available to show the current HBase user?
What is a rack?
What is salting in Spark?
What is IdentityMapper?
Where is the Mapper output (intermediate key-value data) stored?
What are the main features of Impala?
What is shuffle in Spark?
Explain the partitioning, shuffle, and sort phases.
How is data or a file written into HDFS?
Define fault tolerance.
What is the benefit of the distributed cache? Why can't we just keep the file in HDFS and have the application read it?