What are the most commonly defined input formats in Hadoop?
What is the problem with HDFS and streaming data such as logs?
What is a RecordReader in Hadoop?
What does the JobTracker do in Hadoop? How many JobTracker instances run on a Hadoop cluster?
How does the JobTracker schedule a task?
What is high availability in Hadoop?
What is a TaskTracker?
What is a task instance in Hadoop? Where does it run?
What problems can be addressed by using ZooKeeper?
What is Schema on Read and Schema on Write?
How do you install VirtualBox and Ubuntu?
What is unstructured data?
What is the difference between int and IntWritable?