What is the difference between DAG and Lineage?
Answer / Deepak Singh Negi
"DAG (Directed Acyclic Graph) refers to the logical structure of transformations in Apache Spark, while Lineage represents the history of each RDD’s creation and all its transformations. In other words, the DAG describes the computation flow, and the Lineage keeps track of the data lineage."