Session
Beyond Batch: Engineering Self-Evolving Ingestion with Databricks Auto Loader
Overview
| Experience | In Person |
|---|---|
| Track | Data Engineering & Streaming |
| Industry | Enterprise Technology, Financial Services |
| Technologies | Databricks SQL, Delta Sharing, Unity Catalog |
| Skill Level | Intermediate |
Is your data engineering team trapped in brittle schema fixes and rising costs? As data sources multiply, traditional ingestion stalls enterprise insights. At Capital One Software, we’ve moved beyond static ETL to a "self-evolving" framework that treats data as a dynamic stream. This session reveals our modular architecture using Databricks Auto Loader to bridge multi-platform data into S3 with zero manual overhead. We’ll dive into how Schema Evolution and Rescue Columns handle upstream drift automatically, keeping pipelines live when source systems change.
Key Takeaways:
- Modular Design: Decouple ingestion from transformation for reusable, enterprise-scale patterns.
- Dynamic Schema Management: Strategies to detect and adapt to drift without breaking dependencies.
- Cost Optimization: Real-world tactics for balancing trigger intervals and compute for peak efficiency.

Transition from reactive maintenance to automated data empowerment.
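For orientation, the ingestion pattern described above can be sketched roughly as follows. This is a minimal illustration, not the speakers' actual implementation: the bucket, paths, and table names are hypothetical, and the `cloudFiles` source requires a Databricks runtime (it is not part of open-source Spark), so the snippet is not runnable standalone.

```python
# Hypothetical Auto Loader stream: S3 landing zone -> Delta bronze table.
raw_stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    # Persisted schema location lets Auto Loader track the schema across runs
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/orders")
    # addNewColumns evolves the stored schema when new upstream fields appear;
    # the stream restarts with the updated schema instead of needing a manual fix
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
    # Records that don't match the expected schema are captured here rather than dropped
    .option("rescuedDataColumn", "_rescued_data")
    .load("s3://example-bucket/landing/orders/")
)

(
    raw_stream.writeStream
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/orders")
    # Trigger choice is a cost lever: availableNow=True drains the backlog and
    # stops, while a processingTime trigger keeps compute running continuously
    .trigger(availableNow=True)
    .toTable("main.bronze.orders")
)
```

The combination of a persisted schema location, an evolution mode, and a rescued-data column is what keeps the pipeline live under upstream drift: new columns are absorbed, and malformed records are quarantined for inspection rather than breaking downstream dependencies.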
Session Speakers
Yudhish Batra
Distinguished Engineer
Capital One
Syed Mehmood
Director of Software Engineering & Data
Capital One