Prepare Interview

Mock Exams

Make Homepage

Bookmark this page

Subscribe Email Address

Question: What is the purpose of the 'accumulator' in PySpark?
Answer: An 'accumulator' is a variable that can be used in parallel operations and is updated by multiple tasks. It is typically used for implementing counters or sums in distributed computing.

Example:

accumulator = spark.sparkContext.accumulator(0)

# Inside a transformation or action
accumulator.add(1)
Is it helpful? Yes No

Most helpful rated by users:

©2025 WithoutBook