Different Running Modes of Apache Spark
Explain accumulators in Apache Spark.
What is a UDF?
Is it mandatory to set input and output type/format in MapReduce?
What are partitions in Cassandra?
What is the difference between HBase and a relational database?
Explain Thrift and Protocol Buffers vs. Avro.
Why can't aggregation be done in the Mapper?
Explain a scenario where you would use Spark Streaming.
What do you understand by a composite type?
How can you use the consumer API?
What is the difference between HBase and an RDBMS?
How is Mahout used with Python?
Write the command to copy a file from HDFS to the local Linux file system.
Explain what shuffling is in MapReduce.