Session

Sponsored by: Datadog | 4,500 Jobs, No Blind Spots: AccuWeather's Serverless Databricks Migration with Datadog

Overview

ExperienceIn Person
TrackData Engineering & Streaming
IndustryCommunications, Media & Entertainment
TechnologiesDelta Sharing, Lakeflow
Skill LevelIntermediate

AccuWeather delivers forecasts to 1.5 billion people daily. When their pipelines fail, severe weather warnings don't reach the people who need them. The team used to wait 90 minutes before responding to any alert. Not because fixes were hard, but because monitoring that fires on every transient failure trains teams to ignore everything. This session shows how AccuWeather migrated to Serverless Databricks Workflows and operated approximately 4,500 weekly Lakeflow Jobs with Datadog Data Observability. We’ll cover how AccuWeather using serverless cut compute costs in half and the pre-flight pattern that catches failures before expensive cluster jobs run. Additionally, Datadog turns Databricks system table data into alerts built around weather-specific logic leading to over 50% fewer unactionable alerts and incident response 80% faster. This is due to firing incidents only after five failures in a 45-minute window, and only when multiple forecast models fail the same critical hour.

Session Speakers

Speaker placeholderIMAGE COMING SOON

Travis Teague

/Data Operations Manager
AccuWeather

Speaker placeholderIMAGE COMING SOON

Ryan Warrier

/Sr. Product Manager
Datadog