Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is a generic UDF in Hive?
How does Cassandra perform a write operation?
What is a block in Hadoop HDFS? What should be the block size to get optimum performance from the Hadoop cluster?
What happens when we submit a Spark job?
What is logging in Cassandra?
What happens if a block in Hadoop HDFS is corrupted?
Name the most common InputFormats defined in Hadoop. Which one is the default?
What happens to ZooKeeper sessions while the cluster is down?
What are the different data formats supported by Apache Tajo?
What is Small File Problem in Hadoop? How can it be resolved?
Is the output of the mapper or the output of the partitioner written to local disk?
Where is the output of Mapper written in Hadoop?
How is an RDD in Spark different from distributed storage management?
What is Yum?
What is the difference between the transform and map operations on a Spark DStream?