Data Engineer Given a list of followers in the format:123, 345234, 678345, 123…Where column one is the ID of the follower and column two is the ID of the followee. Find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
779How would you use Map/Reduce to split a very large graph into smaller pieces and parallelize the computation of edges according to the fast/dynamic change of data?
722Post New Apache Hadoop Questions
What does ‘jps’ command do?
What are watches?
Data Engineer Given a list of followers in the format:123, 345234, 678345, 123…Where column one is the ID of the follower and column two is the ID of the followee. Find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
Which files are used by the startup and shutdown commands?
Define streaming access?
Did you ever built a production process in hadoop ? If yes then what was the process when your hadoop job fails due to any reason?
What are the modes in which Apache Hadoop run?
What is Derby database?
What happens in a textinputformat?
Can Apache Kafka be used without Zookeeper?
How is HDFS fault tolerant?
On what basis data will be stored on a rack?
how would you modify that solution to only count the number of unique words in all the documents?
Explain the features of pseudo mode?
Knox and Hadoop Development Tools?