Data Engineer: Given a list of followers in the format:
123, 345
234, 678
345, 123
…
where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
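One possible approach to the mutual-follow question, sketched here as a single-process Python simulation rather than a real Hadoop job: the map step keys each edge by its unordered pair of IDs, the shuffle groups both directions of a pair together, and the reduce step emits the pair only if both directions were seen. The function names (`map_edge`, `reduce_pair`, `run_job`) are hypothetical and chosen for illustration; on a real cluster the framework performs the grouping, so no single machine needs to hold the full list.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]  # (follower_id, followee_id)


def map_edge(follower: int, followee: int) -> Iterator[Tuple[Edge, Edge]]:
    """Map step: key each edge by its unordered ID pair, keep the direction as the value."""
    key = (min(follower, followee), max(follower, followee))
    yield key, (follower, followee)


def reduce_pair(key: Edge, values: Iterable[Edge]) -> Iterator[Edge]:
    """Reduce step: a pair is mutual only if both directions appear for the same key."""
    directions = set(values)  # the set also collapses duplicate input rows
    if len(directions) == 2:
        yield key


def run_job(edges: Iterable[Edge]) -> List[Edge]:
    """Local stand-in for the shuffle phase (assumption: a real cluster does this grouping)."""
    groups: Dict[Edge, List[Edge]] = defaultdict(list)
    for follower, followee in edges:
        for key, value in map_edge(follower, followee):
            groups[key].append(value)
    mutual: List[Edge] = []
    for key in sorted(groups):
        mutual.extend(reduce_pair(key, groups[key]))
    return mutual


if __name__ == "__main__":
    sample = [(123, 345), (234, 678), (345, 123)]
    print(run_job(sample))  # [(123, 345)]
```

Because each reducer sees at most two distinct values per key, memory use per key stays constant regardless of how large the follower list is.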
How would you use Map/Reduce to split a very large graph into smaller pieces and parallelize the computation of edges when the data changes rapidly?
On what basis does the NameNode distribute blocks across the DataNodes?
Explain how we can change the split size if our commodity hardware has limited storage space.
Does Hadoop always require digital data to process?
What is the default block size in HDFS?
Are the NameNode and JobTracker on the same host?
Explain the role of the TaskTracker in a Hadoop cluster.
Why do we need passwordless SSH in a fully distributed environment?
What are the different types of Znodes?
Can Hadoop be compared to a NoSQL database like Cassandra?
What are the core components of Apache Hadoop?
Is map like a pointer?
How do you enable the recycle bin (trash) in Hadoop?
What is the default block size in Hadoop 1 and in Hadoop 2? Can it be changed?
What are the site-specific configuration files in Hadoop?
How do you keep an HDFS cluster balanced?