What is the problem with HDFS and streaming data such as logs?
No answer has been posted for this question.
What are the nodes in a Hadoop cluster?
What does the JobConf class do?
What is a Secondary NameNode? Is it a substitute for the NameNode?
What is speculative execution in Apache Hadoop MapReduce?
What is a key-value pair in HDFS?
Given a list of followers with one pair per line (123, 345 / 234, 678 / 345, 123 / …), where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use MapReduce to solve the problem when the list does not fit in memory?
How can we check whether the NameNode is working or not?
Why do we need passwordless SSH in a fully distributed environment?
What does the mapred.job.tracker property do?
How do you write a custom key class?
What is rack awareness?
Define TaskTracker.
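For the mutual-following question in the list above, one possible approach is to key every record on the unordered pair of IDs, so that both directions of a relationship arrive at the same reducer. The sketch below simulates that idea in plain Python and is not a Hadoop job: the function names (`map_edge`, `reduce_pair`, `run`) are hypothetical, and the in-memory dictionary stands in for the shuffle phase that a real cluster would carry out on disk across nodes.

```python
from collections import defaultdict


def map_edge(follower, followee):
    """Map step: emit the unordered pair as key, the directed edge as value."""
    if follower == followee:
        return  # assumption: a self-follow is not counted as a mutual pair
    key = (min(follower, followee), max(follower, followee))
    yield key, (follower, followee)


def reduce_pair(key, values):
    """Reduce step: emit the pair only if both directions were seen."""
    a, b = key
    directions = set(values)
    if (a, b) in directions and (b, a) in directions:
        yield key


def run(edges):
    """Simulate map, shuffle (group by key), and reduce in memory."""
    groups = defaultdict(list)
    for follower, followee in edges:
        for key, value in map_edge(follower, followee):
            groups[key].append(value)
    result = []
    for key in sorted(groups):
        result.extend(reduce_pair(key, groups[key]))
    return result


# Sample rows taken from the question: follower ID, followee ID.
edges = [(123, 345), (234, 678), (345, 123)]
mutual_pairs = run(edges)
print(mutual_pairs)
```

Because each reducer only ever sees the records for a single pair of IDs, no node needs to hold the full list in memory, which is the constraint the question asks about.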