What is a rack awareness algorithm?
What are the major features/characteristics of RDDs (Resilient Distributed Datasets)?
What is KeyValueTextInputFormat in Hadoop?
Define TaskTracker.
What do you understand by partitions in Spark?
On what basis is data stored on a rack?
What is Clustering in Hive?
Is Spark difficult to learn?
If no custom partitioner is defined in Hadoop, how is data partitioned before it is sent to the reducer?
Compare Hadoop and RDBMS.
What are the steps involved in the MapReduce framework?
Name a few import control commands. How does Sqoop handle large objects?
Do we need Hadoop for Spark?
What are the languages supported by Apache Spark, and which is the most popular one?
Explain the sum(), max(), and min() operations in Apache Spark.