What is Apache Spark? What motivated the development of this framework?
What is the role of Hive/HCatalog?
What is the difference between a regular file system and HDFS?
What is a worker node?
How do you delete a directory from HDFS?
How do you submit extra files (JARs, static files) for a MapReduce job at runtime?
Name the filter in HBase that accepts the page size as a parameter.
In how many ways can we create an RDD?
What is a DataNode?
What is a column family in Cassandra?
What is a Hive variable? What do we use it for?
Name the operating system(s) supported for production Hadoop deployments.
Explain what happens if, during the PUT operation, an HDFS block is assigned a replication factor of 1 instead of the default value of 3.
Name the two types of shared variables available in Apache Spark.
What happens if the preferred replica is not in the ISR?