What is the SOURCE command used for in Cassandra?
What are the different methods to run Spark over Apache Hadoop?
In what kinds of scenarios are MapReduce jobs more useful than Pig in Hadoop?
How will you submit extra files or data (such as JARs, static files, etc.) for a MapReduce job at runtime?
How much is flume worth?
What is Yum?
What is TTL in HBase?
In which directory is Hadoop installed?
Since data is replicated three times in HDFS, does that mean any calculation done on one node will also be replicated on the other two?
Why is HBase a schema-less database?
What is the spill factor with respect to RAM?
Can Spark work without Hadoop?
What is DistributedCache and its purpose?
What is the maximum message size that a Kafka server can receive?
When should you use Spark SQL?