What is a Dataproc cluster?
Please explain the sparse vector in Spark.
Can we deploy the JobTracker on a node other than the NameNode?
What is the small files problem in Hadoop? How can it be resolved?
Does Spark provide the storage layer too?
How can we launch a Tajo cluster?
What is the use of combiners in the Hadoop framework?
Knox and Hadoop Development Tools?
Name the various types of lists supported by Bootstrap.
What are the different levels of persistence in Spark?
How can the metastore be shared among multiple users?
What does a split do?
How will you calculate the number of executors required to do real-time processing using Apache Spark? What factors need to be considered for deciding on the number of nodes for real-time processing?
What is a commit log?
What is the DataFrame API?