What is a starvation scenario in Spark Streaming?
Explain the fold() operation in Spark.
What do you understand by Cassandra CQL collections?
What are Kafka logs?
Why do fires spark?
What do you understand by a node in Cassandra?
Explain what you understand by speculative execution.
Define YUM.
What characteristic of the Hadoop Streaming API makes it flexible enough to run MapReduce jobs in languages like Perl, Ruby, and Awk?
Define LZO.
What is the distinction between Apache Mahout and Apache Spark’s MLlib?
Do we need to provide a password even if the key has been added in SSH?
What is the reason for creating a new metastore_db whenever a Hive query is run from a different directory?
What do you mean by the replication factor?
Replication causes data redundancy, so why is it pursued in HDFS?