What is the difference between coalesce and repartition?
Answer Posted / Amit Kumar Singh
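In Apache Spark, coalesce reduces the number of partitions of a DataFrame or RDD. It merges existing partitions instead of redistributing individual records, so by default it avoids a full shuffle. It cannot increase the partition count unless shuffling is explicitly enabled (RDD API only), and the resulting partitions can be uneven in size. Repartition, on the other hand, can either increase or decrease the number of partitions. It always performs a full shuffle, which spreads the data more evenly but costs more because of the extra network traffic. As a rule of thumb, use coalesce to cheaply shrink the partition count (for example, before writing a small result), and use repartition when you need more partitions or a balanced distribution.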
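To make the difference concrete, here is a toy model in plain Python. It is not Spark code and not how Spark implements these operations: the functions `coalesce` and `repartition` below are hypothetical stand-ins, partitions are modeled as plain lists, and the round-robin redistribution is a simplifying assumption. It only illustrates the idea that one operation merges whole partitions while the other moves every record.

```python
from typing import List

Partitions = List[List[int]]


def coalesce(parts: Partitions, n: int) -> Partitions:
    """Toy model: merge whole partitions into at most n groups (no record-level shuffle)."""
    if n >= len(parts):
        # Like Spark's default coalesce, this model cannot increase the count.
        return [list(p) for p in parts]
    merged: Partitions = [[] for _ in range(n)]
    for i, p in enumerate(parts):
        # Each source partition goes, intact, into exactly one target group.
        merged[i * n // len(parts)].extend(p)
    return merged


def repartition(parts: Partitions, n: int) -> Partitions:
    """Toy model: redistribute every record round-robin (stands in for a full shuffle)."""
    out: Partitions = [[] for _ in range(n)]
    k = 0
    for p in parts:
        for record in p:
            out[k % n].append(record)
            k += 1
    return out


# Four deliberately skewed partitions.
parts = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]

coalesced = coalesce(parts, 2)        # merged as-is, so the skew remains
repartitioned = repartition(parts, 2) # every record moved, sizes balanced

print([len(p) for p in coalesced])
print([len(p) for p in repartitioned])
print(len(coalesce(parts, 8)), len(repartition(parts, 8)))
```

In real Spark code the corresponding calls are `df.coalesce(n)` and `df.repartition(n)` on a DataFrame, or `rdd.coalesce(n)` and `rdd.repartition(n)` on an RDD.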