What is the difference between Job and Task in MapReduce?
What is Sqoop Validation?
What is a block?
What is the use of an RDD in Spark?
How is data or a file written into HDFS?
What are Actions?
Explain how MapReduce works.
What is aggregateByKey in Spark?
What is a partition in Hive?
What is the difference between external table and managed table?
Since data is replicated three times in HDFS, does that mean any calculation done on one node is also repeated on the other two?
What are the commonalities between Pig and Hive?
What are SparkContext and SparkSession in Spark?
How do you change the replication factor of data that is already stored in HDFS?
Can you explain the difference between Apache Mahout and Apache Spark's MLlib?