How does Apache Spark work?
Answer by Shweta Kashyap:
"Apache Spark processes large datasets by dividing the data into smaller pieces called Resilient Distributed Datasets (RDDs) and distributing these pieces across a cluster of machines. Each machine runs tasks that operate on its assigned portion of the data, and results are aggregated to produce the final output. Spark also provides high-level APIs for common big data processing tasks like SQL, streaming, and machine learning."n