Session

Saving Millions From Millions: Navigating Towards Cost-Efficiency in Pinterest's Spark Jobs

Overview

Experience: In Person
Type: Breakout
Track: Data Engineering and Streaming
Industry: Enterprise Technology, Professional Services, Retail and CPG - Food
Technologies: Apache Spark
Skill Level: Intermediate
Duration: 40 min

While Spark offers powerful processing capabilities for massive data volumes, cost efficiency remains a persistent challenge for users operating at large scale. At Pinterest, where we run millions of Spark jobs monthly, keeping infrastructure costs efficient is crucial to supporting our rapid business growth.

To tackle this challenge, we have developed several strategies that have saved us tens of millions of dollars across numerous job instances. We will share our analytical methodology for identifying performance bottlenecks and the technical solutions we used to overcome them. Our approach includes extracting insights from billions of collected metrics, leveraging remote shuffle services to address shuffle slowness and improve memory utilization, and reducing costs while hosting hundreds of millions of pods.

This presentation aims to prompt further discussion of Apache Spark cost efficiency in the community and to help the community tackle this common challenge.

Session Speakers

Nan Zhu

Staff Software Engineer
Pinterest