How would you tackle counting words in several text documents?
What are IdentityMapper and IdentityReducer in MapReduce?
What does /*+ STREAMTABLE(table_name) */ do?
What is LazyOutputFormat in MapReduce?
Can I run an ensemble cluster behind a load balancer?
Explain the use of .mecia class?
Where is Mapper output stored?
Who are ‘Data Scientists’?
What are the key components of Hive architecture?
What are Apache Spark, Flume, Lucene, Hama, HCatalog, Mahout, Drill, Crunch and Thrift?
What happens to the NameNode when the JobTracker is down?
What is a reliable and unreliable receiver in Spark?
What features of RDD make it an important abstraction in Spark?
What is the role of ZooKeeper?
What are the benefits of Spark lazy evaluation?