Unilever is one of the world’s largest consumer packaged goods (CPG) companies, selling over a billion products daily across brands like Dove, Hellmann’s and TRESemmé. With that global scale came massive volumes of complex data from diverse sources, as well as legacy data pipelines that created bottlenecks, delayed insights and made AI initiatives difficult to scale. By adopting Databricks Lakeflow Declarative Pipelines, Unilever simplified their architecture, improved data quality and unified batch and streaming workloads into a single, scalable framework. Now, Unilever can move at the speed of their customers, delivering smarter insights and AI-driven decisions across the business.
The cost of infrastructure complexity and slow insights
Unilever’s Customer Analytics team plays a critical role in helping the business understand how their products perform across retailers and regions. As consumer demand became more dynamic and retail partners expected faster, data-driven insights, the company’s existing data infrastructure began to hold them back. Teams were spending too much time managing pipeline issues instead of delivering value. Data was often delayed, hard to trace and difficult to trust. Furthermore, without real-time access, the business couldn’t respond quickly enough to emerging trends or shifts in shopper behavior.
“We needed to move from fragmented, reactive processes to a real-time, AI-ready foundation,” Evan Cherney, Senior Data Science Manager at Unilever, said. “Our teams were stuck maintaining pipelines instead of creating insights, and that limited the impact we could have.”
Unilever’s existing data ecosystem spanned petabytes of information from internal sources, third-party partners and external vendors, streaming in at varying intervals and formats. This included everything from point-of-sale and shipment data to shopper demographics, geolocation and product attribution. Engineering teams relied on Spark jobs triggered through external orchestration tools, but the environment had grown too complex to manage efficiently. Pipelines were fragile and interdependent. Debugging issues was time-consuming, and resolving one failure often meant updating multiple systems to get reports back online.
Additionally, business users struggled to access the data they needed when they needed it. There was limited visibility into where data originated, how it was transformed or how reliable it was. Trust in the numbers eroded, and the organization was missing opportunities to move faster, scale smarter and make more informed, data-driven decisions. The team knew that pairing the right technology with a redesigned data architecture would be a significant step forward for the organization.
Building a scalable, AI-ready data foundation
To meet rising data demands and deliver insights faster, Unilever rebuilt their data architecture around the Databricks Data Intelligence Platform and Lakeflow Declarative Pipelines. The goal was to reduce complexity, cut engineering overhead and enable real-time, AI-ready analytics. With Lakeflow Declarative Pipelines, Unilever moved from manually orchestrated pipelines to declarative workflows that automatically handle dependencies, enforce data quality and provide complete visibility into how data flows through the system.
Adopting a medallion architecture with Bronze, Silver, Gold and Platinum layers allowed the team to structure data in a way that aligned with both performance and governance needs. Batch and streaming data now flow through a single, resilient pipeline, with quality checks and transformations handled natively. “Instead of stitching together tables manually and worrying about refresh dependencies, we now focus on delivering insights. Lakeflow Declarative Pipelines manages the flow, handles quality checks and gives us the transparency we never had before,” Cherney said.
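To make the pattern concrete, here is a minimal sketch of a Bronze-to-Silver flow using the `dlt` Python module, the Python interface for these declarative pipelines. The table names, source path, columns and quality rules below are hypothetical illustrations, not Unilever’s actual pipeline:

```python
import dlt
from pyspark.sql.functions import col

# Bronze: incrementally ingest raw point-of-sale files with Auto Loader.
# The landing path and JSON format are placeholder assumptions.
@dlt.table(comment="Raw point-of-sale records, ingested as-is")
def pos_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/retail/raw/pos/")  # hypothetical landing path
    )

# Silver: the dependency on pos_bronze is declared, not orchestrated.
# Rows failing an expectation are dropped and counted in pipeline metrics.
@dlt.table(comment="Cleaned point-of-sale records")
@dlt.expect_or_drop("valid_sku", "sku IS NOT NULL")
@dlt.expect_or_drop("positive_units", "units_sold > 0")
def pos_silver():
    return (
        dlt.read_stream("pos_bronze")
        .select("store_id", "sku", "units_sold", col("ts").cast("timestamp"))
    )
```

Because the tables are declared rather than wired together by an external scheduler, the framework infers the dependency graph, runs both tables incrementally and surfaces the expectation results, which is the transparency Cherney describes.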
Governance was another significant leap forward. Unity Catalog provided centralized access control, column-level lineage and secure data sharing across teams. Data that was once fragmented and difficult to trace is now discoverable, trusted and easily accessible for exploration. Teams can collaborate with confidence, knowing they’re working with consistent, high-quality data.
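As a rough sketch of what that centralized control can look like in practice: the GRANT statement and the system.access lineage tables are standard Unity Catalog features, while the catalog, schema, table and group names here are invented for illustration:

```python
# Grant a retail analytics group read access to a curated Gold table.
# Catalog, schema, table and group names are hypothetical.
spark.sql("""
    GRANT SELECT ON TABLE retail.gold.pos_daily_summary
    TO `customer-analytics`
""")

# Trace where a table's data came from using Unity Catalog's
# built-in lineage system tables.
spark.sql("""
    SELECT source_table_full_name, target_table_full_name, event_time
    FROM system.access.table_lineage
    WHERE target_table_full_name = 'retail.gold.pos_daily_summary'
    ORDER BY event_time DESC
""").show(truncate=False)
```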
Performance improvements followed quickly. Serverless compute, liquid clustering and direct publishing streamlined both pipeline execution and data delivery. Instead of maintaining dozens of custom pipelines, Unilever now routes data dynamically using metadata. Dashboards update in near real time, AI models train on fresher inputs and the company can respond to market signals faster than ever before. What was once a high-effort system to maintain is now a flexible platform designed for scale, speed and smarter decisions.
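One common way to implement metadata-driven routing of this kind is to generate declarative tables from a small configuration list rather than hand-writing dozens of near-identical pipelines. The sketch below assumes the `dlt` Python module; the SOURCES list, paths and clustering keys are hypothetical, and `cluster_by` is the table property that enables liquid clustering:

```python
import dlt

# Hypothetical metadata describing each inbound feed; in practice this
# could live in a config table or file rather than inline.
SOURCES = [
    {"name": "pos",       "path": "/Volumes/retail/raw/pos/",       "keys": ["store_id"]},
    {"name": "shipments", "path": "/Volumes/retail/raw/shipments/", "keys": ["warehouse_id"]},
    {"name": "shoppers",  "path": "/Volumes/retail/raw/shoppers/",  "keys": ["region"]},
]

def register_bronze(cfg):
    # A factory function ensures each generated table closes over its own
    # config dict instead of the shared loop variable.
    @dlt.table(
        name=f"{cfg['name']}_bronze",
        cluster_by=cfg["keys"],  # liquid clustering on the common query keys
        comment=f"Raw {cfg['name']} feed, routed by metadata",
    )
    def bronze():
        return (
            spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load(cfg["path"])
        )

for cfg in SOURCES:
    register_bronze(cfg)
```

Adding a new feed then means adding one entry to the metadata, not building and maintaining another custom pipeline.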
Unlocking speed and scale through modern data architecture
Since migrating to the Databricks Data Intelligence Platform, Unilever has significantly enhanced their ability to respond to market shifts, retailer needs and consumer behavior in real time. Pipeline refreshes and business-critical dashboards that once took hours or days to deliver now land in minutes. Teams across the organization are making decisions faster, powered by trusted, high-quality data that’s always up to date and easy to trace.
Real-time insights have had a direct impact on Unilever’s collaboration with retail partners. Sales, supply chain and marketing teams now have access to continuously refreshed, curated data to support everything from inventory optimization to campaign performance. AI and ML models that were slowed down by inconsistent inputs now run on streaming data with minimal latency, enabling more intelligent demand forecasting and more adaptive planning cycles.
The technical gains also translated into clear operational wins. Unilever significantly reduced their infrastructure costs. “Databricks helped us move to serverless compute, while eliminating redundant workflows. These efficiencies put us in position to lower operational costs by 25%,” Cherney said. Data pipelines are also much more performant. “Pipelines on our legacy infrastructure previously took hours to process. Now, they run 2 to 5 times faster,” Cherney explained. Engineering teams that were once focused on maintenance and debugging have shifted their energy to innovation, helping the business uncover new opportunities and scale data products more efficiently.
Cherney said, “Lakeflow Declarative Pipelines has completely changed how we work. We’re no longer stuck chasing broken pipelines. We’re delivering real-time insights, enabling AI at scale and helping our business move faster than ever.”
While the shift to Databricks focused on improving performance, Unilever was also building a data foundation ready for what’s next. With observability, lineage and embedded quality checks in place, Unilever now has a platform that can scale with future AI innovations, support experimentation across teams and deliver trusted insights in the moment they’re needed most.