Hadoop Interview Questions
Questions Answers Views Company eMail

Explain what shuffling is in MapReduce.

472

Explain what the distributed cache is in the MapReduce framework.

476

Mention the main configuration parameters that a user needs to specify to run a MapReduce job.

591

Explain the function of the MapReduce partitioner.

488

Explain what a heartbeat is in HDFS.

29

Explain the difference between an input split and an HDFS block.

25

Explain how indexing is done in HDFS.

25

Mention the best way to copy files between HDFS clusters.

65

Mention the difference between HDFS and NAS.

61

What is the difference between an input split and an HDFS block?

56

Mention the data storage component used by Hadoop.

311

Mention what the text input format does.

305

Mention which daemons run on a master node and on slave nodes.

365

Explain what the NameNode is in Hadoop.

319

Explain what a sequence file is in Hadoop.

347


Unanswered Questions { Hadoop }

Define "Transformations" in Spark

285


List some use cases where classification machine learning algorithms can be used.

304


How is Hadoop cost-effective?

307


What are Scala and Spark?

226


What is the Reducer used for?

786


What is the difference between Cassandra, Pig and Hive?

735


What is the version-id mismatch error in Hadoop?

892


What is commodity hardware? Does commodity hardware include RAM?

360


What is a sink in Flume?

71


Explain the Parquet file format in Apache Spark. When is it the best choice?

303


What is ChainMapper?

378


Can you define RDD lineage?

237


How does Cassandra perform a write operation?

74


What is a lineage graph in Spark?

229


What is the Python stress test in Cassandra?

60