How often do you need to reformat the NameNode?
Can you explain the Sqoop metastore?
What is JMX, and how is it useful in Cassandra?
What is speculative execution?
Why do we need SparkContext?
Is Kafka a big data technology?
What is a record reader?
What does the UPPER (or UCASE) function do in Hive? Give an example.
What is the functionality of Query Processor in Apache Hive?
If we want to reuse the same data across several transformations, what would improve performance?
What is the major difference between a local and a remote metastore?
What is the small files problem in Hadoop?
What are the various levels of persistence in Apache Spark?
What are some commonly used machine learning algorithms?
What makes Apache Spark good at low-latency workloads like graph processing and machine learning?