AccuWeather

CUSTOMER STORY

Transforming weather forecasting with Lakeflow Jobs

3x

Faster dataset development

50%

Reduction in unactionable alerts

50%

Cost savings on serverless job usage



AccuWeather powers global forecasts with Databricks

AccuWeather, recognized and documented as the most accurate and most used source of weather forecasts and warnings in the world, has saved over 12,000 lives, prevented injury to over 100,000 people, and saved companies tens of billions of dollars through better planning and decision-making.

Billions of people around the world rely on AccuWeather’s proven Superior Accuracy™ across its consumer digital platforms.

AccuWeather For Business serves more than half of the Fortune 500 companies and thousands of other businesses and government agencies globally, all of which pay for weather forecasts more accurate than those from any other known source.

Faced with the challenge of ingesting, processing and storing immense volumes of diverse weather data at scale, AccuWeather turned to Databricks and Lakeflow Jobs, working with Datadog for observability.

Managing complex weather data at unprecedented scale

AccuWeather has three critical data use cases that power its global forecasting services. The company maintains 30 years of historical weather data for customer analysis, processes real-time observational data for live streaming to customer apps and generates proprietary forecasts by blending multiple weather models with expert meteorologist input—a unique approach that sets them apart from competitors who rely on single data sources. “There are hundreds of parameters that go into forecasting the weather, and when it comes to forecast models, you have hundreds of those that are all weighted differently,” explains Travis Teague, Data Operations Manager. “Measuring, explaining and forecasting weather requires continuous collection and processing of all of these models in near real time.”

The company faced two primary challenges with its on-premises infrastructure. First, data volume constraints forced them to regularly purge valuable information, preventing historical analysis of model performance. “If we wanted to do analysis on a model run that came in and see if it was worse than what’s coming in now, we can’t do that because we have to purge that data from our on-prem systems because we just don’t have capacity for it,” says Teague.

The second challenge involved managing numerous data sources without a single source of truth. Multiple teams were pulling the same data for different purposes, creating expensive data silos and redundant processing costs. The situation was complicated by weather data’s unique characteristics: highly specialized file formats found in no other industry, which require custom tools for processing.

AccuWeather needed a solution that could automate its complex data workflows, eliminate the manual planning and validation steps that consumed days of effort, and orchestrate seamless job dependencies across their diverse weather datasets.

Unifying complex weather data with Databricks and Lakeflow Jobs

AccuWeather migrated to Databricks to leverage cloud scalability, eliminate on-premises storage constraints and consolidate its fragmented data sources into a unified platform. To address its complex orchestration challenges, AccuWeather selected Lakeflow Jobs as the primary workflow engine to automate the manual planning and validation processes that were consuming days of effort each month. Lakeflow Jobs enabled AccuWeather to orchestrate seamless dependencies across 4,500+ weekly jobs, automatically trigger downstream processes when new weather data arrived and eliminate the cumbersome manual interventions that had previously plagued their data pipelines. The platform's ability to consolidate all weather data formats—from highly structured forecast data to unstructured radar images in PNG or GeoTIFF formats—into a central data lake created the single source of truth that had been missing from their fragmented on-premises setup.
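As a rough sketch, a job of this shape can be declared as a Databricks job resource: a file-arrival trigger starts the pipeline when new provider data lands, and `depends_on` encodes the downstream ordering. The job name, paths and task names below are illustrative, not AccuWeather’s actual configuration:

```yaml
resources:
  jobs:
    process_weather_feed:
      name: process-weather-feed
      trigger:
        file_arrival:
          url: /Volumes/weather/raw/incoming/   # hypothetical landing path
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./notebooks/ingest.py
        - task_key: transform
          depends_on:
            - task_key: ingest    # runs only after ingest succeeds
          notebook_task:
            notebook_path: ./notebooks/transform.py
```

Declaring dependencies this way lets the orchestrator, rather than manual intervention, decide when each downstream step is eligible to run.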

Databricks also helped AccuWeather address its data volume challenges. Because Databricks is a proven platform for enterprise scalability and reliability, AccuWeather can now efficiently distribute large workloads across scalable clusters, which gives the company a platform to leverage AI and build its own models. Lakeflow Jobs’ serverless capabilities mean the team no longer worries about server patches, software upgrades or managing underlying Spark versions, and it can process massive amounts of data at scale quickly and cost-efficiently.

The platform’s flexibility allows AccuWeather to install specialized weather industry tools on traditional clusters, enabling them to work with highly specialized data formats. “Our data is very unique, and Databricks can handle that,” says Teague.

Lakeflow Jobs orchestrates the complex multi-step workflows that power AccuWeather’s data products, automatically triggering job sequences as soon as new data becomes available. For historical data processing, Lakeflow Jobs manages the intricate pipeline dependencies where observational data feeds into initial historical records, then triggers downstream jobs that enrich and validate the datasets before making them available through Delta Sharing. The orchestration capabilities have been crucial for AccuWeather’s advanced forecasting initiative, where Lakeflow Jobs coordinates the ingestion of multiple weather models, triggers machine learning processes that weight and blend different forecasts and manages the complex job dependencies required for reinforcement training workflows. By automating these previously manual orchestration tasks, Lakeflow Jobs enables AccuWeather to deliver real-time weather intelligence to millions of users while building more sophisticated forecasting capabilities.
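The pattern described above, where each downstream job is triggered only once all of its upstream dependencies have finished, can be sketched in simplified form as a small dependency-graph runner. The job names and pipeline shape here are illustrative stand-ins, not AccuWeather’s actual pipeline, and this is a conceptual sketch of what the orchestrator does, not Lakeflow Jobs itself:

```python
from collections import deque

def run_when_ready(jobs: dict[str, list[str]], run) -> list[str]:
    """Run jobs in dependency order: each job starts only after all of
    its upstream dependencies have finished (Kahn's algorithm).

    jobs maps job name -> list of upstream job names it depends on.
    run is a callable invoked once per job as it becomes eligible.
    Returns the order in which jobs were executed.
    """
    # Count unmet upstream dependencies and index downstream consumers.
    pending = {name: len(deps) for name, deps in jobs.items()}
    downstream: dict[str, list[str]] = {name: [] for name in jobs}
    for name, deps in jobs.items():
        for dep in deps:
            downstream[dep].append(name)

    ready = deque(name for name, n in pending.items() if n == 0)
    order: list[str] = []
    while ready:
        job = ready.popleft()
        run(job)
        order.append(job)
        for nxt in downstream[job]:
            pending[nxt] -= 1
            if pending[nxt] == 0:  # all upstreams done: trigger it
                ready.append(nxt)
    if len(order) != len(jobs):
        raise ValueError("cycle in job dependencies")
    return order

# Illustrative pipeline: ingest new observations, then enrich and
# validate the history before publishing via Delta Sharing.
pipeline = {
    "ingest_observations": [],
    "enrich_history": ["ingest_observations"],
    "validate_dataset": ["enrich_history"],
    "publish_delta_share": ["validate_dataset"],
}
```

In a managed orchestrator the "run" step is a notebook or script task and the graph comes from declared task dependencies; the scheduling logic, however, follows this same shape.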

Faster development and enhanced weather intelligence

The migration to Lakeflow Jobs delivered immediate operational improvements that accelerated AccuWeather’s dataset development from three months per dataset to one month, enabling faster time to market for new weather data products. Lakeflow Jobs eliminated the need for manual intervention, saving an average of one to two days of processing time per job. The team no longer needs to manually validate data before and after processing: serverless compute handles pre-checks to ensure all source weather data files have arrived from providers, and post-checks to validate that processed datasets meet accuracy and completeness requirements before triggering downstream jobs. Converting one critical job entirely to serverless Lakeflow Jobs cut its usage costs in half, and reduced reliance on Azure Data Factory has lowered both complexity and licensing expenses.
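The pre- and post-check pattern can be illustrated with a minimal sketch. The file names, thresholds and check criteria here are hypothetical, chosen only to show the shape of the checks, not AccuWeather’s actual validation rules:

```python
def pre_check(expected: set[str], arrived: set[str]) -> bool:
    """Pre-check: confirm every expected source file from the
    weather data providers has arrived before processing starts."""
    missing = expected - arrived
    if missing:
        print(f"Holding job: waiting on {sorted(missing)}")
        return False
    return True

def post_check(rows_written: int, rows_expected: int,
               null_fraction: float,
               max_null_fraction: float = 0.01) -> bool:
    """Post-check: validate the processed dataset for completeness
    and accuracy before downstream jobs are triggered."""
    complete = rows_written >= rows_expected
    clean = null_fraction <= max_null_fraction
    return complete and clean

# Hypothetical run: one provider file has not arrived yet, so the
# pre-check holds the job instead of processing a partial input.
expected_files = {"gfs_20240101.grib2", "ecmwf_20240101.grib2"}
pre_check(expected_files, {"gfs_20240101.grib2"})  # returns False
```

In a serverless job these checks would run as lightweight gating tasks; only a passing post-check allows the dependent downstream tasks to fire.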

Development productivity has surged thanks to Databricks Asset Bundles, which provide template repositories for new projects with built-in CI/CD deployment. “Lakeflow Jobs saves us at least 10% of hours of work every month,” explains Teague. This time was previously spent on manual orchestration tasks including planning deployment schedules, coordinating job timing across teams, manually triggering dependent jobs when upstream processes completed, and troubleshooting workflow failures that required restarting entire job sequences. “That time saved means it increased our development lifecycle speed. Instead of spending three months on one data set, we can now do one data set a month.”
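At its core, an Asset Bundle template of the kind described is a repository whose `databricks.yml` names the bundle and its deployment targets; a CI/CD pipeline then validates and deploys it with the Databricks CLI (`databricks bundle validate`, `databricks bundle deploy -t <target>`). The bundle and target names below are illustrative, not AccuWeather’s repositories:

```yaml
bundle:
  name: weather-dataset-template   # hypothetical template name

targets:
  dev:
    mode: development   # per-developer, prefixed deployments
    default: true
  prod:
    mode: production    # locked-down target deployed from CI/CD
```

Because every new dataset project starts from the same template, the job definitions, target environments and deployment steps are consistent from day one, which is what removes the per-project orchestration setup the team used to do by hand.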

This acceleration has also enabled AccuWeather to expand their team and work on multiple datasets simultaneously instead of focusing the entire team on a single project.

The faster dataset development cycles and automated orchestration directly support AccuWeather’s ability to keep delivering forecasts with proven Superior Accuracy™, often with more advance warning than any other known source, saving lives, protecting property and helping people make the best weather-impacted decisions. The team can now bring in more raw datasets to enrich its historical products and incorporate additional models into its forecasting engine, making its weather intelligence more accurate and more comprehensive. Faster iteration also means more achievable deadlines, rapid development cycles and richer data for the customers who rely on AccuWeather’s services.

Meanwhile, Databricks’ partnership with observability platform provider Datadog has ensured pipeline reliability and performance optimization. Datadog’s Data Jobs Monitoring provides unified visibility into Databricks jobs and workflows, helping AccuWeather detect failures and latency spikes faster. "We have reduced unactionable alerts by over 50% through these correlated and aggregated alerting monitors," notes Teague. "Before, our normal incident response time was around an hour and a half. Now it's reduced to just a couple of minutes."

When monitoring Lakeflow Jobs, Datadog applies domain-specific business logic to detect critical issues, such as repeated job failures, and to surface the root cause of the failure. When thresholds are reached, alerts are automatically routed to the owning teams, enabling faster investigation and resolution. Datadog also helps rightsize AccuWeather’s Databricks environment and improve job performance by surfacing idle compute, cluster utilization and Spark execution metrics.

The combined improvements have transformed how AccuWeather delivers weather intelligence to millions of users worldwide. As Teague summarizes, “Databricks and Lakeflow Jobs’ scalability, availability and overall performance have helped AccuWeather meet its mission to save lives, protect property, and help people prosper.”