Why do we need passwordless SSH in a fully distributed environment?
What is a 'block' in HDFS?
Explain the basic difference between a traditional RDBMS and Hadoop.
How do you enable the recycle bin or trash in Hadoop?
Knox and Hadoop Development Tools?
What are the functions of NameNode?
What is the problem with small files in Apache Hadoop?
Explain what happens in TextInputFormat.
Can HBase run without Hadoop?
How would you tackle counting words in several text documents?
Did you maintain the Hadoop cluster in-house or use Hadoop in the cloud?
Given a list of followers in the format: 123, 345; 234, 678; 345, 123; … where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
Which one is the default InputFormat in Hadoop?
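For the mutual-following question above, one possible approach can be sketched as follows. This is only a minimal sketch that simulates the map, shuffle, and reduce phases in plain Python, not a Hadoop job. The function names `map_edge`, `reduce_pair`, and `mutual_pairs` are hypothetical and chosen for illustration. The idea is to key every edge on the unordered pair of IDs, so that both directions of a relationship reach the same reducer, which then needs to hold only the records for one pair at a time.

```python
from collections import defaultdict


def map_edge(follower, followee):
    # Key on the unordered pair so (a, b) and (b, a) meet at the same reducer.
    key = (min(follower, followee), max(follower, followee))
    yield key, (follower, followee)


def reduce_pair(key, edges):
    # The pair is mutual only if both distinct directions were seen.
    if len(set(edges)) == 2:
        yield key


def mutual_pairs(edges):
    # Local stand-in for the shuffle phase: group mapped values by key.
    # In a real cluster the framework does this grouping on disk, so the
    # full list never has to fit in memory on a single machine.
    groups = defaultdict(list)
    for follower, followee in edges:
        for key, value in map_edge(follower, followee):
            groups[key].append(value)

    result = []
    for key in sorted(groups):
        result.extend(reduce_pair(key, groups[key]))
    return result


sample = [(123, 345), (234, 678), (345, 123)]
print(mutual_pairs(sample))  # prints [(123, 345)]
```

Duplicate records and self-follows do not count as mutual here, because the reducer compares distinct directed edges rather than counting records.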