Product descriptions:
PepsiCo operates one of the world’s largest food and beverage portfolios, spanning over 120 brands across 200 countries and generating massive operational, commercial, and consumer data every day. After decades of growth across Red Brick, DB2, Teradata, Oracle, Presto, Synapse, and sector-specific data lakes, the company was weighed down by duplicated data, inconsistent KPIs, and BI teams still relying heavily on Excel. PepsiCo required a unified warehouse layer to replace its legacy systems, enhance performance, and establish a single, governed foundation for analytics and AI. By consolidating on Azure Databricks SQL as its enterprise data warehouse and BI serving layer, the team has reduced costs by up to 80% on key workloads, improved processing speed by 50%, and begun retiring Synapse and Teradata in favor of one trusted enterprise data foundation. Today, finance, sales, and field operations teams can access consistent, real-time metrics to drive faster decisions across the global business.
Modernizing a 27-year-old BI estate into a unified warehouse foundation
PepsiCo’s BI environment had grown across Red Brick, DB2, Teradata, Oracle, Presto, Synapse, and multiple sector data lakes—each holding overlapping versions of critical data. These layers slowed performance, created conflicting KPIs, and forced teams to rely on Excel for answers. As Joshua Lee put it, “Data duplication was everywhere… and literally data governance was not there, period.” Moving to Databricks SQL and Unity Catalog provided PepsiCo with a path to unify lineage, access, and metric definitions under a single, governed warehouse layer.
With Databricks SQL (DBSQL) as the standard warehouse engine, PepsiCo began consolidating global and sector data into a single enterprise data foundation. The shift replaced fragmented BI logic with curated, reusable datasets designed for consistent consumption across Tableau, Power BI, APIs, and AI workloads. The new structure reduced shadow reporting, ended debates over whose spreadsheet was correct, and established one trusted version of the truth for teams operating in more than 20 markets in North America.
From First Gen Architecture

To Next Gen Architecture

Delivering faster, more reliable analytics for finance and commercial teams
Finance and sales teams were among the first to benefit from the new warehouse. Previously, key dashboards refreshed slowly and often conflicted with parallel spreadsheet logic maintained by regional teams. Databricks SQL allowed PepsiCo to rebuild financial and commercial tables once and serve them to all downstream tools with consistent definitions. As John Abraham, Director of Data Product Architecture, explained, “We calculated all the metrics in one place… so it avoids confusion of which spreadsheet was correct.” That clarity dramatically improved reporting accuracy for global finance.
Commercial teams saw equally meaningful gains. Mobile sales applications that previously depended on strained legacy systems now pull governed metrics directly from DBSQL. Field reps gained timely insights for customer conversations, forecasting, and pricing decisions, supported by unified data rather than intuition. The move to DBSQL also laid the groundwork for real-time alerts, suggested ordering, and other AI-assisted capabilities that had been impossible without a trusted warehouse layer.
Cutting warehouse TCO by retiring Synapse and consolidating on DBSQL
PepsiCo’s North America analytics stack relied heavily on Synapse, which required expensive reserved capacity and constant data movement from the lake into a secondary warehouse layer. The team replaced Synapse with Databricks SQL serverless to eliminate redundant pipelines and regain cost flexibility. As Joshua noted, “It costs half a million dollars with a three-year reserve pricing… we were able to reduce that to using pay-as-you-go Databricks SQL serverless at 175K with the same, without performance degradations.” This established a repeatable pattern for broader warehouse retirement.
Retiring Synapse simplified the Power BI architecture, cutting out entire refresh stages and serving reports directly from governed gold tables. Performance matched legacy expectations while total cost of ownership (TCO) fell sharply. PepsiCo is now applying this template across 46 additional Synapse environments and preparing to decommission Teradata. Consolidating analytics on DBSQL reduces friction, lowers operating costs, and positions PepsiCo to scale BI and AI workloads on a single warehouse engine.
How PepsiCo migrated: from siloed lakes and warehouses to one enterprise data foundation
PepsiCo approached migration as a structured modernization rather than a direct lift-and-shift. The team began by merging global and sector data lakes, normalizing table structures, and assigning clear ownership through Unity Catalog. From there, they rebuilt high-value gold datasets as governed products and validated them through early finance and sales migrations. As Joshua summarized, “We’re dedicating Databricks SQL as our main data warehouse to source all our consuming products.” That decision anchored every downstream move.
As patterns proved reliable, PepsiCo expanded migrations to real-time sales APIs, Tableau workloads, and large Power BI estates. The team optimized performance by improving file layout, clustering, and query design, and collaborated closely with Databricks and Microsoft to resolve concurrency and driver-related issues. With these foundations in place, PepsiCo is now migrating Teradata and 46 additional Synapse systems into a single enterprise data foundation on Azure Databricks, clearing the path for consistent BI, lower TCO, and AI-driven analytics at scale.
