Why are transformations lazy operations on Apache Spark RDDs? How is this useful?
What is an InputSplit in MapReduce?
Explain the sum(), max(), and min() operations in Apache Spark.
How many reducers run in a MapReduce job?
What are the differences between Hadoop 1 and Hadoop 2?
Which filter accepts the page size as the parameter in HBase?
How is 0xdata's H2O different from Apache Mahout?
What are shared variables?
How does Cassandra write data?
What are Hadoop, HBase, Hive, and Cassandra? Specify the similarities and differences among them.
Define replication factor.
What is HDFS?
Explain HCatReader.
List some common problems faced by data analysts.
Why do we need RDDs in Spark?