Does Spark SQL use Hive?
Where is an RDD stored?
How do you deal with sparse data?
What are the methods in the Mapper interface?
Why do we need Hadoop Archives? How are they created?
What is throughput? How does HDFS achieve high throughput?
What is the difference between a reducer and a combiner?
What are the default read and write classes in Hive?
What do you mean by data processing?
Explain the different cluster managers in Apache Spark.
What is the default extension of the files produced from a Sqoop import using the --compress parameter?
What are the various libraries available on top of Apache Spark?
What are internal and external tables in Hive?
What are the limitations of Spark?
Define Hadoop Archives. What is the command for archiving a group of files in HDFS?