Explain the difference between NAS and HDFS.
What is cqlsh, and why is it used?
Did you maintain the Hadoop cluster in-house, or did you use Hadoop in the cloud?
What is a map-side join?
Explain accumulators in Apache Spark.
How do you iterate over all rows in a ColumnFamily?
Name the management tools in Cassandra.
What are column families? What happens if you alter the block size of a ColumnFamily on an already populated database?
What is the relationship between Hadoop, HBase, Hive, and Cassandra?
Who uses Cassandra?
What do you mean by logistic regression?
Ideally, what should the replication factor be in a Hadoop cluster?
Differentiate between HDFS and HBase.
How does Impala process join queries for large tables?
Which relational databases does Sqoop support?