How would you calculate the number of executors required for real-time processing with Apache Spark? What factors need to be considered when deciding on the number of nodes for real-time processing?
What is a Spark job?
Can we run Spark on Windows?
Explain the Spark coalesce() operation.
What does hadoop-env.sh do?
Does HBase support SQL-like syntax?
Is there any difference between the HBase data model and the RDBMS data model?
Is it possible to provide multiple inputs to Hadoop? If so, how can you give multiple directories as input to a Hadoop job?
Name some of the complex data types that Avro supports.
What is a DAG in Spark?
What is JMX, and how is it useful in Cassandra?
How are tables managed in Apache Tajo?
How does Cassandra delete data?
What file systems does Spark support?
What is sharding in big data?