SESSION

Sponsored by: Sync Computing | Best Practices to Manage Databricks Clusters at Scale to Lower Costs

OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Data Engineering and Streaming
INDUSTRY: Enterprise Technology, Health and Life Sciences, Financial Services
TECHNOLOGIES: Apache Spark, ETL, Governance
SKILL LEVEL: Beginner
DURATION: 40 min

Many companies quickly scale their Databricks usage to thousands of jobs, only to find themselves with ballooning costs and difficult-to-manage infrastructure. Platform teams often find themselves gridlocked with other groups and priorities, unable to act to resolve these problems. At Sync, we’ve worked with companies from startups to the Fortune 100 on their Databricks usage, identifying common trends and pitfalls. In this talk, we’ll present common findings on which practices work and which don’t for Jobs clusters, SQL warehouses, all-purpose compute clusters, and ML workloads. We’ve observed companies save up to 75% on their Databricks spend by implementing various techniques to optimize performance. We’ll also present Sync’s automated Databricks management solution, Gradient, which can automate many of the lessons learned here to help companies bring down costs at scale.

SESSION SPEAKERS

Jeff Chou

CEO / Co-Founder
Sync Computing