What is the ideal replication factor in a Hadoop cluster?
Is Hadoop required for data science?
Suppose Hadoop spawned 100 tasks for a job and one of the tasks failed. What will Hadoop do?
What is the JobTracker and what does it do in a Hadoop cluster?
What are the different types of Znodes?
What are the most commonly defined input formats in Hadoop?
What is the main purpose of HDFS fsck command?
Explain the input and output data formats of the Hadoop framework.
Is map like a pointer?
Data Engineer: Given a list of followers in the format 123, 345; 234, 678; 345, 123; … where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
Why do we use HDFS for applications with large data sets but not when there are a lot of small files?
How do you enable the recycle bin (trash) in Hadoop?
What is HDFS? How is it different from traditional file systems?