What is the relationship between HDFS, HBase, Pig, Hive, and Azkaban?
How can you trigger automatic clean-ups in Spark to handle accumulated metadata?
What problems have you faced when working on Hadoop code?
Explain the difference between Gen 1 and Gen 2 Hadoop with regard to the NameNode.
What are the most common InputFormats in Hadoop?
Does Hive support temporary tables?
Is Impala intended to handle real-time queries in low-latency applications, or is it for ad hoc queries for data exploration?
How does A/B testing work?
What tasks can you perform to manage services using the Ambari Services tab?
What is the replica placement strategy in Cassandra?
What are the different clustering algorithms in Mahout?
Explain cqlsh.
If reducers do not start before all mappers finish, why does the progress of a MapReduce job show something like map(50%) reduce(10%)? Why is the reducers' progress percentage displayed when the mappers have not finished yet?
Explain the Spark coalesce() operation.
What are the execution modes in Apache Pig?