When we send data to a node, do we allow settling time before sending more data to that node?
What benefits does YARN bring to Hadoop?
What are the main components of Hadoop?
Is it possible to write Hadoop job output to multiple directories?
Hadoop achieves parallelism by dividing tasks across many nodes, so a few slow nodes can rate-limit the rest of the program and slow it down. What mechanism does Hadoop provide to combat this?
What is a Combiner in Hadoop?
What is the CAP theorem? Which aspects of this theorem does Hadoop support?
How does the JobTracker schedule a job for the TaskTracker?
Can you explain speculative execution?
What happens if you get a ‘connection refused’ Java exception when you run hadoop fsck /?
Which operating system(s) are supported for production Hadoop deployments?
What is a UDF?
Where does the NameNode store its metadata, and why?
What is Federation?
Which port does SSH work on?