Hadoop Interview Questions
Questions Answers Views Company eMail

What is the use of SparkContext?

269

How is the processing of streaming data achieved in Apache Spark? Explain.

240

Can you do real-time processing with Spark SQL?

271

Discuss the role of the Spark driver in a Spark application.

243

What features of RDD make it an important abstraction in Spark?

227

What is Apache Spark? What is the reason behind the evolution of this framework?

234

What are accumulators in Apache Spark?

266

Why are transformations lazy operations in Apache Spark RDD? How is this useful?

339

Explain the different types of transformations on DStreams.

255

Describe the run-time architecture of Spark.

244

What is the flatMap transformation in Apache Spark RDD?

246

Can you run Apache Spark on Apache Mesos?

263

Describe Partition and Partitioner in Apache Spark.

258

Describe accumulators in Apache Spark in detail.

269

List the languages supported by Apache Spark.

233


Un-Answered Questions { Hadoop }

What services run after running an HBase job?

153


What are the use cases of Apache Pig?

605


Explain the difference between an HDFS block and an input split.

55


Is there any difference between the HBase data model and the RDBMS data model?

1057


Explain the difference between an input split and an HDFS block.

53


Do we need to install Scala for Spark?

260


List some common problems faced by data analysts.

298


What is the row key?

134


What is anti-entropy, and how is it associated with a Merkle tree?

56


Explain how you can get exactly-once messaging from Kafka during data production.

392


Explain what the JobTracker is in Hadoop. What actions does Hadoop follow?

327


How is RDD in Spark different from Distributed Storage Management?

335


What are brokers in Kafka?

371


What is Bucketing and Clustering in Hive?

526


List the various HDFS daemons in an HDFS cluster.

31