Hadoop Interview Questions and Answers for Freshers and Experienced Candidates, Asked in Various Company Job Interviews

Hadoop Interview Questions

Questions Answers Views Company eMail

Explain a simple Map/Reduce problem.

Capital One,

718

Data Engineer: Given a list of followers in the format:
123, 345
234, 678
345, 123
…
where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?

Twitter,

748
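
One possible approach to the mutual-follower question above, sketched in plain Python rather than on a real Hadoop cluster: the map step keys each edge by its unordered pair of IDs, and the reduce step reports a pair only when both directions were seen. The function names (`map_edge`, `reduce_pair`, `run`) and the in-memory shuffle are assumptions made for illustration, and the sample rows are an assumed reading of the example in the question.

```python
from collections import defaultdict


def map_edge(line):
    """Map: key each edge by its unordered ID pair, keep the direction as the value."""
    follower, followee = (int(x) for x in line.split(","))
    key = (min(follower, followee), max(follower, followee))
    yield key, (follower, followee)


def reduce_pair(key, directions):
    """Reduce: the pair is mutual only if both directions appear."""
    if len(set(directions)) == 2:
        yield key


def run(lines):
    # In-memory stand-in for the shuffle phase; on a cluster the framework
    # groups values by key across machines, so no single node holds the full list.
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_edge(line):
            groups[key].append(value)
    result = []
    for key in sorted(groups):
        result.extend(reduce_pair(key, groups[key]))
    return result


mutual = run(["123, 345", "234, 678", "345, 123"])
print(mutual)
```

Because each reducer only sees the values for one pair, the memory needed per key stays small even when the full follower list does not fit in memory.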

How would you use Map/Reduce to split a very large graph into smaller pieces and parallelize the computation of edges according to the fast/dynamic change of data?

Twitter,

694

Write a Hive UDF that returns a sentiment score. For example, if good = 1, bad = -1, and average = 0, then a review of a restaurant states "Good food, bad service," your score might be 1 - 1 = 0.

LinkedIn,

684
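
A minimal sketch of the scoring logic described in the sentiment question above. A Hive UDF is normally written in Java, so this Python version only illustrates the word-to-score lookup, not the Hive integration. The names `WORD_SCORES` and `sentiment_score` are hypothetical, and the word list contains only the three words given in the question.

```python
import re

# Scores taken from the question: good = 1, bad = -1, average = 0.
WORD_SCORES = {"good": 1, "bad": -1, "average": 0}


def sentiment_score(review):
    """Sum the scores of known words; unknown words count as 0."""
    if review is None:
        return 0
    words = re.findall(r"[a-z]+", review.lower())
    return sum(WORD_SCORES.get(word, 0) for word in words)


print(sentiment_score("Good food, bad service"))
```

For the review quoted in the question, the score is 1 - 1 = 0.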

Explain how RDDs work with Scala in Spark

Capital One,

341

Define HRegionServer in HBase

157

What is the use of shutdown command?

210

What is HBase HMaster?

360

What is the function of HMaster?

255

Suppose that your data is stored in collections, for instance, some binary data, message data or metadata is all keyed on the same value. Will you use HBase for this?

158

Explain the limitations of HBase.

187

State some applications of HBase.

167

What is a column family in HBase?

196

Discuss the different tombstone markers used for deletion in HBase.

185

Explain the Scope operators used in HBase.

182

Un-Answered Questions { Hadoop }

Define the term ‘Lazy Evaluation’ with reference to Apache Spark.

304

What is the Physical plan in pig architecture?

531

How is the write operation performed on a Cassandra node?

What is the Catalyst framework?

292

What is the procedure for data storage in Cassandra?

What are the main features and characteristics of Hadoop that make it the most popular and powerful Big Data tool?

489

What if a namenode has no data?

667

Explain why replication is critical in Kafka.

512

Explain some disadvantages of Avro.

Explain how you can minimize data transfers when working with Spark.

436

What is a generic UDF in Hive?

757

Why do we need Hadoop Archives? How are they created?

563

What is the difference between the GROUP and COGROUP operators in Pig?

565

What is a Memtable in Cassandra?

109

What are the advantages of Datasets?

291
