What is an RDD lineage graph? How does it enable fault tolerance in Spark?
Answer Posted / Abhik Das
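An RDD lineage graph is a directed acyclic graph (DAG) that records the ancestry of Resilient Distributed Datasets (RDDs) in Apache Spark. Each RDD keeps references to its parent RDD(s) and to the transformation that derives it from them, so together they form a graph describing how any RDD can be rebuilt from its source data.

The lineage graph enables fault tolerance because Spark does not need to replicate the data itself. When a partition is lost (for example, because an executor fails), Spark recomputes just that partition by replaying the recorded transformations on the parent partitions, starting from the original source or from the nearest cached or checkpointed RDD.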
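The recomputation idea can be illustrated with a small toy model in plain Python. This is only a sketch and not Spark's API: `ToyRDD`, `from_source`, `lose_partition`, and `lineage` are hypothetical names invented for illustration. It also assumes a single parent per RDD (real RDDs can have several, e.g. after a join) and keeps every computed partition in memory so that a loss can be simulated.

```python
from typing import Callable, Dict, List, Optional


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's class).

    It stores how to compute each partition plus a link to its parent,
    which is the information a lineage graph records.
    """

    def __init__(
        self,
        num_partitions: int,
        compute: Callable[[int], List[int]],
        parent: Optional["ToyRDD"] = None,
        name: str = "source",
    ) -> None:
        self.num_partitions = num_partitions
        self._compute = compute          # partition index -> records
        self.parent = parent             # lineage edge to the parent
        self.name = name
        self._materialized: Dict[int, List[int]] = {}

    @staticmethod
    def from_source(partitions: List[List[int]]) -> "ToyRDD":
        # The source data is assumed to be durable (like a file on HDFS).
        return ToyRDD(len(partitions), lambda i: list(partitions[i]))

    def map(self, f: Callable[[int], int]) -> "ToyRDD":
        return ToyRDD(
            self.num_partitions,
            lambda i: [f(x) for x in self.partition(i)],
            parent=self,
            name="map",
        )

    def filter(self, pred: Callable[[int], bool]) -> "ToyRDD":
        return ToyRDD(
            self.num_partitions,
            lambda i: [x for x in self.partition(i) if pred(x)],
            parent=self,
            name="filter",
        )

    def partition(self, i: int) -> List[int]:
        # If the partition is missing, rebuild it from the parent
        # by replaying the recorded transformation.
        if i not in self._materialized:
            self._materialized[i] = self._compute(i)
        return self._materialized[i]

    def collect(self) -> List[int]:
        return [x for i in range(self.num_partitions) for x in self.partition(i)]

    def lose_partition(self, i: int) -> None:
        # Simulate an executor failure that drops one partition.
        self._materialized.pop(i, None)

    def lineage(self) -> List[str]:
        names, node = [], self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


source = ToyRDD.from_source([[1, 2, 3], [4, 5, 6]])
evens_squared = source.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

before = evens_squared.collect()      # [4, 16, 36]
evens_squared.lose_partition(1)       # partition 1 is "lost"
after = evens_squared.collect()       # partition 1 is recomputed via lineage

print(evens_squared.lineage())        # ['filter', 'map', 'source']
print(before == after)                # True
```

Only the lost partition is rebuilt; partition 0 is reused as-is. In Spark itself, the recorded lineage of an RDD can be inspected with `rdd.toDebugString()`.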