Bayer Consumer Health scales global self-service analytics with Unity Catalog

Published: March 4, 2026

Healthcare & Life Sciences / 5 min read

Summary

• Bayer Consumer Health built a single, governed data platform using Databricks and Unity Catalog to eliminate data silos and enable global self-service analytics.
• With 7 business domains organized around shared core data assets, Bayer simplified data management and accelerated analytics delivery.
• A single reporting endpoint now enables convenient reporting across the complete data estate.

Bayer is a life sciences company and a global leader in healthcare and nutrition, active in over 100 markets across 83 countries. Guided by its mission, “health for all, hunger for none,” Bayer is setting out to give its 92,500 employees secure, discoverable access to data at scale. Five years ago, fragmented systems made this nearly impossible, and teams in the Consumer Health Division struggled to use data properly for decision-making. By adopting Databricks and Unity Catalog, Bayer Consumer Health built a single, governed data platform that enables self-service analytics without data silos.

“With Databricks, we are building reusable core assets, enabling self-service analytics and fostering a data-driven organization that provides insights for all and data silos for none.” (André Wuthenow, Principal Cloud Platform Architect, Bayer)

Global fragmentation and “data tourism”

As a globally distributed company, Bayer’s previous data analytics setup was fragmented across markets, with each using its own tech stack for different purposes. When data needed to be shared, it was often copied, sometimes multiple times, in what Bayer calls “data tourism.” Data tourism led to increased data management costs and slower implementation of new solutions. This complexity, along with performance issues, led to low adoption of the solutions Bayer IT could provide and challenged the company’s ability to make data-driven decisions. Beyond cost and performance, data tourism made it difficult to understand who was using which data, enforce consistent access controls, or confidently reuse trusted assets across markets.

In addition, Bayer faced significant challenges in leveraging the latest data analysis tools, such as machine learning. “The systems needed to support machine learning added an additional cost and maintenance burden because we needed to move machine learning to a completely dedicated platform on a different technology stack, in a different data center, on a different type of scaler, so we couldn’t really properly use machine learning at that point in time,” said André Wuthenow, Principal Cloud Platform Architect at Bayer.

When looking for a solution to these challenges, the Bayer Consumer Health Data & Analytics organization knew it needed to build a global, scalable data platform. With over 2,000 business users and 25 zones running across three global regions, supported by more than 250 machine learning and data engineers, Bayer needed a cloud-based system that could leverage serverless technology where possible. “It was important to make sure that our solutions scale with any data volume and number of simultaneous users to ensure everyone gets the best performance and immediate results,” said Wuthenow. A cloud-based solution would also be fiscally responsible, ensuring Bayer pays only for what it uses, and would give the company the flexibility to try out new services on a small scale before rolling them out as a global standard.

Template-based environments in Databricks

Bayer Consumer Health selected Databricks as the foundation for its data platform, complemented by Azure services for data ingestion, storage and other supporting functions. All data transformation and data cleansing is done in Databricks, ensuring that raw data is turned into reusable, quality-checked and trusted data assets. With this solution, Bayer can also surface Azure ML and other Azure AI services for its developers to leverage.

Databricks provides a unified, integrated platform to address the needs of Bayer’s data engineers, whether they are building BI reports, ML solutions or analytical applications. With Databricks as its unified platform, Bayer can run multiple projects with many teams working in parallel without negatively affecting one another. Each team can independently manage the lifecycle of new data products. Knowing that its local markets would have unique data needs that differ from global analytics, Bayer needed a system that would centralize all its data to avoid multiple copies and “data tourism,” while still giving each team the flexibility to use the data in ways that fit its markets. “We leveraged Databricks to create template-based environments with dedicated service instances that ensure proper resource isolation and lifecycle management,” said Wuthenow.
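To make the idea of template-based environments concrete, here is a minimal sketch in Python. It is not Bayer's implementation: the class name, the stage names, the region and the naming scheme for workspaces and storage are all hypothetical, chosen only to show how one shared template can stamp out separate, consistently named environments per team.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EnvironmentTemplate:
    """Hypothetical template holding the settings every team environment shares."""
    cloud_region: str
    stages: Tuple[str, ...] = ("dev", "test", "prod")

    def render(self, team: str) -> List[Dict[str, str]]:
        """Return one environment description per stage for the given team."""
        return [
            {
                "name": f"{team}-{stage}",
                "region": self.cloud_region,
                # Assumed naming scheme: every environment gets its own
                # workspace and storage names, so nothing is shared by accident.
                "workspace": f"ws-{team}-{stage}",
                "storage": f"st{team}{stage}",
            }
            for stage in self.stages
        ]


# Hypothetical region and team name, for illustration only.
template = EnvironmentTemplate(cloud_region="westeurope")
envs = template.render("allergy")
for env in envs:
    print(env["name"], env["workspace"], env["storage"])
```

Because every environment comes from the same template, a new team or stage can be added by changing one input rather than hand-building another setup.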

Unity Catalog provides the centralized governance and metadata layer across these environments, allowing Bayer to govern core data assets once while enabling teams to securely consume and reuse them across projects and regions.

Faster data product implementation and self-service reporting

With the introduction of Unity Catalog as a replacement for their Hive Metastore, Bayer moved from a push-based to a pull-based data-sharing approach. Data consumers only require permission to access governed and trusted core data assets. Thus, each data domain team can define for itself what to share with whom, without copying data across environments. With the introduction of serverless in combination with Unity Catalog, Bayer Consumer Health enabled secure connectivity from their Development Environment to Production Core Data Assets. This enabled data engineers to build new solutions in their development environment with production-grade data, leading to faster time to market for new analytics solutions, while still enforcing measures against data exfiltration. “Unity Catalog was a game changer for us,” said Wuthenow. “The new model makes it easy for us to ensure that data products in all stages have the latest data available, which speeds up building and testing of new solutions because engineers can use production-grade data to test their solutions.”
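As an illustration of the pull-based model described above, the sketch below builds the Unity Catalog SQL statements a domain team might issue to let a consumer group read one table in place, with no copy made. The `GRANT` syntax and the `USE CATALOG`, `USE SCHEMA` and `SELECT` privileges are standard Unity Catalog SQL. The helper function and the catalog, schema, table and group names are hypothetical and are not taken from Bayer's platform.

```python
from typing import List


def grant_read_access(catalog: str, schema: str, table: str, group: str) -> List[str]:
    """Build the SQL statements that let `group` read a single table where it lives."""
    principal = f"`{group}`"  # backticks allow group names with hyphens
    return [
        # Reading a table requires usage rights on its parent catalog and schema.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {principal}",
        f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {principal}",
    ]


# Hypothetical names, for illustration only.
statements = grant_read_access("sales_core", "orders", "daily_orders", "marketing-analysts")
for statement in statements:
    print(statement)
```

The statements would be run by the owning domain team, for example in a Databricks SQL editor or notebook. The consumer then queries the table directly under its three-level name.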

Bayer Consumer Health also introduced a central reporting endpoint that links to all their catalogs. As global Core Data Assets are managed in a single region, employees can easily discover and combine data across domains through a single, governed entry point, ensuring self-service analytics scales without reintroducing silos or inconsistent definitions.
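A single endpoint that links to every catalog means a report can address tables from different domains by their three-level Unity Catalog names (`catalog.schema.table`) in one query. The following sketch only assembles such a query as a string. The helper functions and all catalog, schema, table and column names are hypothetical.

```python
from typing import Tuple

TableRef = Tuple[str, str, str]  # (catalog, schema, table)


def qualified(ref: TableRef) -> str:
    """Return the three-level name catalog.schema.table."""
    return ".".join(ref)


def cross_domain_query(left: TableRef, right: TableRef, key: str) -> str:
    """Build a SQL join across two tables that may live in different catalogs."""
    return f"SELECT * FROM {qualified(left)} JOIN {qualified(right)} USING ({key})"


# Hypothetical assets from two different domains.
query = cross_domain_query(
    ("sales_core", "orders", "daily_orders"),
    ("marketing_core", "campaigns", "campaign_calendar"),
    key="market_id",
)
print(query)
```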

With Databricks and Unity Catalog, Bayer Consumer Health established shared standards for data access, naming, and security while preserving flexibility. Governance is embedded into the platform rather than applied after the fact, allowing Bayer to scale self-service analytics with confidence. As Wuthenow puts it, “We are building reusable core assets, enabling self-service analytics and fostering a data-driven organization that provides insights for all, data silos for none.”
