Hadoop (4218)
Big Data General (104)
Big Data AllOther (3)
What is document store db? Explain with an example.
Define Spark-SQL?
What is difference between dataset and dataframe?
How to iterate all rows in ColumnFamily?
What is key-value store db?
Describe Partition and Partitioner in Apache Spark?
What is Apache Spark and what are the benefits of Spark over MapReduce?
Explain the concept of Tunable Consistency in Cassandra?
What is distributed copy (distcp)?
What is Mapper in Hadoop MapReduce?
What is row in hbase?
Data Engineer Given a list of followers in the format:123, 345234, 678345, 123…Where column one is the ID of the follower and column two is the ID of the followee. Find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
Why is jconsole used?
What do hive variable means? How can we utilize it?
Describe Memtable?