Lessons Learned from Migrating the Largest Immunization Analytics Platform
OVERVIEW
EXPERIENCE | In Person |
---|---|
TYPE | Breakout |
TRACK | Data Engineering and Streaming |
INDUSTRY | Health and Life Sciences, Public Sector |
TECHNOLOGIES | Delta Lake, ETL, Orchestration |
SKILL LEVEL | Intermediate |
DURATION | 40 min |
Accenture recently migrated one of the largest immunization registries in the United States to Databricks. The registry holds nearly one billion records for more than 50 million individuals. The migration involved implementing SCD Type 2 tables using Delta Live Tables (DLT) and change data capture (CDC) from an Oracle database.

Insights from the implementation include:

- Know your data: understanding the nature of updates is crucial.
- Understand how file structure affects performance: SCD Type 2 requires multiple operations whose performance depends on how the data is organized.
- Consider your compute requirements: an effective cluster strategy balances cost against meeting SLAs.
- Decouple unrelated workflows: we decoupled workflows so that compute resources are optimized for critical functionality.

We expect features such as Liquid Clustering and serverless compute to improve the efficiency of our immunization registry analytics platform in managing vast healthcare datasets.
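For readers unfamiliar with SCD Type 2, the following is a minimal pure-Python sketch of the bookkeeping that pattern implies when applied to a stream of change events. It is not the session's implementation and not how DLT works internally; DLT handles this declaratively. The `Version` record, the `apply_changes` helper, and the field names (`key`, `value`, `valid_from`, `valid_to`) are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Version:
    key: str                 # business key (hypothetical, e.g. a record id)
    value: str               # tracked attribute (hypothetical)
    valid_from: int          # sequence number at which this version became current
    valid_to: Optional[int]  # None while this version is still the current one


def apply_changes(history: List[Version],
                  changes: List[Tuple[str, str, int]]) -> List[Version]:
    """Apply (key, value, seq) change events to an SCD Type 2 history.

    Assumption: every incoming seq is newer than anything already in
    `history`; late-arriving events across batches are not handled here.
    """
    # Index the open (current) version for each key.
    current = {v.key: v for v in history if v.valid_to is None}

    # Sort by sequence so out-of-order events within one batch are handled.
    for key, value, seq in sorted(changes, key=lambda c: c[2]):
        open_version = current.get(key)
        if open_version is not None:
            if open_version.value == value:
                continue  # nothing changed, so no new version is needed
            open_version.valid_to = seq  # close the superseded version
        new_version = Version(key, value, seq, None)
        history.append(new_version)
        current[key] = new_version
    return history


# Usage: two keys, with the events for "p1" arriving out of order.
history = apply_changes([], [("p1", "dose 2", 20),
                             ("p1", "dose 1", 10),
                             ("p2", "dose 1", 15)])
for version in history:
    print(version)
```

Even in this toy form, each change requires finding the current row for a key, closing it, and appending a new one. That lookup-then-write pattern is one reason the way data is organized on disk can matter for SCD Type 2 performance.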
SESSION SPEAKERS
Michael Pisarsky
Solution Architect
Mosaic Data Solutions
Rex Phillips
Strategy Senior Principal
Accenture