Data + AI Summit 2023
JUNE 26-29, 2023

Moving from Apache Spark 2 to Apache Spark 3: Spark Version Upgrade at Scale in Pinterest

Wednesday, June 29 @2:50 PM


Apache Spark has become Pinterest’s dominant distributed batch processing framework. With Spark 3 now arriving, most of Pinterest’s Spark applications still run on Spark 2, and Pinterest is migrating its Spark platform and most production Spark jobs to Spark 3. In this talk, we’ll share how Pinterest performed the Spark 3 migration at scale. Moving to Spark 3 is a major version upgrade that brings many incompatibilities and significant differences from Spark 2. We’ll first introduce the motivation for the migration, then cover the major challenges, the approaches we took, how we handled different Spark job types, how we addressed incompatibilities between Spark 2 and Spark 3 such as Scala version support, and how we migrated our existing production Spark jobs efficiently and safely at scale, without impacting stability or SLOs, with the help of the Auto Migration Service (AMS). We’ll then discuss the performance improvements and cost savings achieved so far, as well as our plans for future improvements.

After attending this session, you’ll have a better understanding of the challenges involved in performing the Spark 2 to Spark 3 migration at scale. You’ll also be able to draw on the experiences and considerations shared in this session to move your users to Spark 3 in a smooth and stable manner.


  • Session


  • Hybrid


  • Data Engineering


  • Intermediate


  • Moscone South | Level 2 | 215


  • 35 min

Session Speakers

Zaheen Aziz

Software Engineer


Zirui Li

Software Engineer

