Hadoop Interview Questions
Questions Answers Views Company eMail

What is standalone mode in a Spark cluster?

205

Explain Apache Spark Streaming. How is streaming data processed in Apache Spark?

250

In what ways is SparkSession different from SparkContext?

282

Explain the fold() operation in Spark.

256

Define SparkContext in Apache Spark.

229

List the advantages of DataFrames over RDDs in Apache Spark.

236

What is map in Apache Spark?

224

Write the commands to start and stop Spark in an interactive shell.

225

Describe the various running modes of Apache Spark.

239

What are the ways to run Spark over Hadoop?

258

What is the Catalyst query optimizer in Apache Spark?

242

What are the various types of shared variables in Apache Spark?

235

What are the common mistakes developers make when using Apache Spark?

248

What is the role of the Spark driver, and where does it run on the cluster?

252

What is speculative execution in Spark?

276


Un-Answered Questions { Hadoop }

When Hive is run in embedded mode

1692


How much is Flume worth?

63


What is the problem with small files in Hadoop?

331


What do you mean by Stream Processing in Kafka?

402


What is Apache ZooKeeper meant for?

5


Explain the execution plans of a Pig script.
or
Differentiate between the logical and physical plans of an Apache Pig script.

343


Why is BlinkDB used?

258


How does HDFS provide good throughput?

42


What is Federation?

307


What is the typical size of an HDFS block?

750


Why is there a need for the Pig language?

439


How many types of tunable consistency are supported in Cassandra?

81


What is spark.yarn.executor.memoryOverhead?

313


Describe Partition and Partitioner in Apache Spark.

258


What is the significance of the ‘IF EXISTS’ clause when dropping a table?

535