Question: Explain the difference between 'cache' and 'persist' operations in PySpark.
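Answer: 'cache' takes no arguments and is shorthand for calling 'persist' with the default storage level (MEMORY_ONLY for RDDs, a memory-and-disk level for DataFrames). 'persist' is more flexible because it also accepts an explicit StorageLevel, such as MEMORY_ONLY, MEMORY_AND_DISK, or DISK_ONLY. Both are lazy: the data is stored only when an action first computes it.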

Example:

df.cache()

©2025 WithoutBook