Session
Building the Commercial Lakehouse: AI-Driven, Open, and Interoperable
Overview
| Experience | In Person |
|---|---|
| Track | Governance & Security |
| Industry | Health & Life Sciences |
| Technologies | Delta Sharing, Lakeflow, Unity Catalog |
| Skill Level | Intermediate |
AstraZeneca modernized its Commercial data estate, transitioning from legacy EMR to a high-performance Global Commercial Lakehouse. At the core lies Unity Catalog (UC), established as the unified governance layer. We demonstrate how UC enables a true open architecture, orchestrating seamless federation for legacy Glue workloads and using Iceberg REST integration to provide zero-copy access for Iceberg clients such as Snowflake.

The transformation was powered by a multi-agent AI migration system that used LLMs and testing agents to automate the conversion of 250+ pipelines into a rigorous Medallion architecture. Key components include Databricks Asset Bundles for versioned CI/CD and Liquid Clustering for storage optimization.

Processing complex external data sources (Claims, CRM, Digital), the platform delivered 50% cost savings. This session covers the multi-agent migration architecture, UC-centric interoperability design patterns, and lessons from modernizing commercial data at scale.
Session Speakers
Paul Kuntz
Commercial Data Lake Capability Lead
AstraZeneca Pharmaceuticals