Can you explain broadcast variables?

Question Posted / Ravi Ranjan Kumar

Answer Posted / Ravi Ranjan Kumar

Broadcast variables in Apache Spark share read-only data with every executor taking part in a computation. They are useful when many tasks need the same data in parallel, such as a lookup table. The data must still fit in the memory of the driver and of each executor, so broadcasting suits small to moderately sized datasets, not data that is too large for a single node. When a broadcast variable is created, Spark sends one copy to each executor and caches it there, and every task running on that executor reads the cached copy. This saves network bandwidth compared with shipping the same data along with each task.
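
A minimal sketch of this pattern in PySpark may help. It assumes PySpark is installed and runs against a local master. The table COUNTRY_NAMES, the helper resolve, and the function run_example are hypothetical names made up for illustration, not part of Spark.

```python
from typing import Dict, List, Tuple

# Hypothetical read-only lookup table; a real one would usually be much larger.
COUNTRY_NAMES: Dict[str, str] = {"IN": "India", "US": "United States", "DE": "Germany"}


def resolve(code: str, table: Dict[str, str]) -> str:
    """Return the display name for a country code, or 'Unknown' if absent."""
    return table.get(code, "Unknown")


def run_example(codes: List[str]) -> List[Tuple[str, str]]:
    """Resolve codes inside Spark tasks using a broadcast copy of the table."""
    # Imported here so resolve() stays usable on machines without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[2]")            # assumption: local mode, 2 worker threads
        .appName("broadcast-sketch")
        .getOrCreate()
    )
    sc = spark.sparkContext
    try:
        # Ship the table to each executor once instead of with every task.
        names = sc.broadcast(COUNTRY_NAMES)
        # Tasks read the cached copy through .value and never modify it.
        result = (
            sc.parallelize(codes)
            .map(lambda c: (c, resolve(c, names.value)))
            .collect()
        )
        # Drop the cached copies on the executors once they are no longer needed.
        names.unpersist()
        return result
    finally:
        spark.stop()
```

Calling run_example(["IN", "ZZ"]) starts a local Spark session, so it needs a working PySpark and Java setup. The point to notice is that tasks use names.value instead of capturing COUNTRY_NAMES directly in the closure, which would serialize the table with every task.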
