Since data is replicated three times in HDFS, does that mean any calculation done on one node is also repeated on the other two?
What are the complex data types in Pig?
What are the different Eval functions available in Pig?
On what basis does the NameNode distribute blocks across the DataNodes in HDFS?
Describe DataStax OpsCenter.
List the steps by which Cassandra writes changed data to the commit log.
What filters are available in Apache HBase?
What is logistic regression?
What are the key features of a NoSQL database?
Explain the filter transformation.
What is the relationship between HDFS, HBase, Pig, Hive, and Azkaban?
How is a Mapper instantiated in a running job?
What is Apache Spark Streaming?
Define a data lake.
Write a short note on the disadvantages of MapReduce.