What characteristic of the streaming API makes it flexible enough to run MapReduce jobs in languages like Perl, Ruby, Awk, etc.?
Hadoop General Questions
What are the important features of Hadoop?
Ideally, what should the replication factor be in a Hadoop cluster?
List some use cases where classification machine learning algorithms can be used.
What is the key difference between NameNode and DataNode in Hadoop?
Which port does SSH work on?
Is it possible to write Hadoop job output to multiple directories? If yes, how?
What do you understand by standalone (or local) mode?
What are the port numbers of the TaskTracker?
What is partitioning?
Ideally, what should the block size be in Hadoop?
What is data cleansing?
What are the features of pseudo mode?
Explain the difference between an InputSplit and a block.
What happens if you get a ‘connection refused’ Java exception when you run hadoop fsck /?
Explain what the JobTracker is in Hadoop. What are the actions followed by Hadoop?