Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the Apache Spark machine learning library (MLlib)?
Is Pig Latin a strongly typed language? If yes, how did you come to that conclusion?
How would you calculate the number of unique visitors for each hour by mining a huge Apache log? You can post-process the output of the MapReduce job.
When should you use HBase?
What tools does Ambari use for monitoring?
What is a slot in Hadoop v1? Why was it removed from Hadoop v2?
Explain how to write output to a file using Storm.
Explain the term "cluster".
Explain Cassandra.
When to use Cassandra?
Explain SchemaRDD.
In case of embedded Hive, can the same metastore be used by multiple users?
How can you contact your client every day?
What are the fundamental configuration parameters specified in MapReduce?
What is Databricks in relation to Spark?