What is a cluster in Cassandra?
What is shuffling in MapReduce?
Can you explain Spark RDDs?
Explain metadata in the NameNode.
What are shared variables?
What is a TaskInstance?
What is a Mapper? How can we compress Mapper output in Hadoop?
Compare Hadoop and Spark.
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What are InputSplit and RecordReader, and what do they do?
Where can I get sample data to try?
How do you access HDFS?
What is the relationship between HDFS, HBase, Pig, Hive, and Azkaban?
Explain HCatInputFormat and HCatOutputFormat.
What is the use of HCatalog?