Session

Production Patterns for Multi-Tenant AI: Scaling to 800 Banks on Databricks

Overview

Experience: In Person
Track: Artificial Intelligence & Agents
Industry: Enterprise Technology, Financial Services
Technologies: Unity Catalog, Lakebase
Skill Level: Intermediate
Serving 800 German banks on one AI platform required solving a fundamental multi-tenant challenge: maintaining strict data isolation while scaling to hundreds of QPS under banking regulations. We present two innovations:

1. Metadata-driven capacity management: Delta tables track index capacity per endpoint in real time. When thresholds are reached, the system provisions new Vector Search endpoints and rebalances indexes. This eliminates manual scaling interventions and provides a reusable pattern for any platform hitting infrastructure limits at scale.

2. Hierarchical retrieval with RAPTOR: Plain agents retrieve flat chunks. We implement recursive clustering and summarization to build tree structures, enabling retrieval across abstraction levels, from granular details to high-level themes. This approach significantly improves answer quality for complex financial documents.

Takeaways: federated vector search patterns, metadata-driven auto-scaling architecture, RAPTOR implementation
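
The metadata-driven capacity pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the speakers' implementation: an in-memory dictionary stands in for the Delta metadata table, and the names `CapacityRegistry`, `place_index`, `provision_endpoint`, `MAX_INDEXES_PER_ENDPOINT`, and `PROVISION_THRESHOLD` are hypothetical, as are the numeric limits. Rebalancing of existing indexes is left out; the sketch only shows threshold-triggered provisioning.

```python
from dataclasses import dataclass, field

# Assumed limits, chosen for illustration only; not taken from the session.
MAX_INDEXES_PER_ENDPOINT = 4
PROVISION_THRESHOLD = 0.75  # an endpoint at or above this fill ratio takes no new indexes


@dataclass
class CapacityRegistry:
    """In-memory stand-in for the metadata table (endpoint name -> assigned indexes)."""
    endpoints: dict = field(default_factory=dict)

    def utilization(self, endpoint):
        # Fraction of the assumed per-endpoint index limit that is in use.
        return len(self.endpoints[endpoint]) / MAX_INDEXES_PER_ENDPOINT

    def provision_endpoint(self):
        # Hypothetical hook: a real system would create the endpoint on the platform here.
        name = f"endpoint-{len(self.endpoints) + 1}"
        self.endpoints[name] = []
        return name

    def place_index(self, index_name):
        """Assign an index to the least-loaded endpoint below the threshold,
        provisioning a new endpoint when none has headroom."""
        candidates = [e for e in self.endpoints
                      if self.utilization(e) < PROVISION_THRESHOLD]
        if candidates:
            target = min(candidates, key=self.utilization)
        else:
            target = self.provision_endpoint()
        self.endpoints[target].append(index_name)
        return target


registry = CapacityRegistry()
placements = [registry.place_index(f"tenant_{i}_index") for i in range(7)]
for endpoint, indexes in registry.endpoints.items():
    print(endpoint, indexes)
```

Because every placement decision reads only the registry, the same logic could in principle run as a scheduled job against a metadata table, which is what makes the approach independent of manual intervention.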

Session Speakers

Natasha Ueberschlag

Manager, AI Forward Deployed Engineering
Databricks

Simon Schmitz

Senior Data Scientist
Atruvia