Session

Production Patterns for Multi-Tenant AI: Scaling to 800 Banks on Databricks

Overview

Experience	In Person
Track	Artificial Intelligence & Agents
Industry	Enterprise Technology, Financial Services
Technologies	Unity Catalog, Lakebase
Skill Level	Intermediate

Serving 800 German banks on one AI platform required solving a fundamental multi-tenant challenge: maintaining strict data isolation while scaling to hundreds of QPS under banking regulations. We present two innovations:

1. Metadata-driven capacity management: Delta tables track index capacity per endpoint in real time. When thresholds are reached, the system provisions new Vector Search endpoints and rebalances indexes. This eliminates manual scaling interventions and provides a reusable pattern for any platform hitting infrastructure limits at scale.

2. Hierarchical retrieval with RAPTOR: Plain agents retrieve flat chunks. We implement recursive clustering and summarization to build tree structures, enabling retrieval across abstraction levels, from granular details to high-level themes. This approach significantly improves answer quality for complex financial documents.

Takeaways: federated vector search patterns, metadata-driven auto-scaling architecture, RAPTOR implementation
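As an illustration of the metadata-driven capacity management idea, here is a minimal Python sketch. It is not the speakers' implementation: the in-memory `CapacityRegistry` stands in for the Delta tables, `_provision` is a placeholder for whatever call actually creates a Vector Search endpoint, and the names `max_indexes`, `threshold`, and `vs-endpoint-N` are hypothetical. Rebalancing of existing indexes is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """One row of the (hypothetical) capacity metadata table."""
    name: str
    max_indexes: int
    indexes: list[str] = field(default_factory=list)

    @property
    def utilization(self) -> float:
        return len(self.indexes) / self.max_indexes


class CapacityRegistry:
    """In-memory stand-in for the capacity metadata; assumption, not a real API."""

    def __init__(self, max_indexes: int, threshold: float) -> None:
        self.max_indexes = max_indexes
        self.threshold = threshold
        self.endpoints: list[Endpoint] = []

    def _provision(self) -> Endpoint:
        # Placeholder: a real system would create an endpoint here
        # and record it in the metadata table.
        endpoint = Endpoint(
            name=f"vs-endpoint-{len(self.endpoints) + 1}",
            max_indexes=self.max_indexes,
        )
        self.endpoints.append(endpoint)
        return endpoint

    def place(self, index_name: str) -> str:
        """Assign an index to the least-utilized endpoint below the threshold,
        provisioning a new endpoint when none qualifies."""
        candidates = [e for e in self.endpoints if e.utilization < self.threshold]
        if candidates:
            target = min(candidates, key=lambda e: e.utilization)
        else:
            target = self._provision()
        target.indexes.append(index_name)
        return target.name


registry = CapacityRegistry(max_indexes=4, threshold=0.5)
placements = [registry.place(f"tenant_{i}_index") for i in range(5)]
```

With these illustrative settings, each endpoint accepts two indexes before the threshold triggers a new one, so the five placements spread across three endpoints without manual intervention.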
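The recursive tree construction behind RAPTOR-style retrieval can be sketched in a similarly hedged way. This simplified Python example makes two loud substitutions: clustering is replaced by grouping adjacent nodes in fixed-size groups, and `summarize` is a caller-supplied placeholder where an LLM summarizer would normally go. Retrieval scores every node in the flattened tree by word overlap, standing in for embedding similarity. All function and parameter names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    text: str
    level: int  # 0 = original chunk, higher = more abstract summary
    children: list["Node"] = field(default_factory=list)


def build_tree(
    chunks: list[str],
    summarize: Callable[[list[str]], str],
    group_size: int = 2,
) -> list[Node]:
    """Return all nodes of the tree, leaves first, root last."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each level shrinks")
    current = [Node(text=c, level=0) for c in chunks]
    all_nodes = list(current)
    level = 0
    while len(current) > 1:
        level += 1
        parents = []
        # Simplification: adjacent grouping stands in for real clustering.
        for i in range(0, len(current), group_size):
            group = current[i:i + group_size]
            summary = summarize([n.text for n in group])
            parents.append(Node(text=summary, level=level, children=group))
        all_nodes.extend(parents)
        current = parents
    return all_nodes


def retrieve(nodes: list[Node], query: str, k: int = 3) -> list[Node]:
    """Rank nodes from every level together by word overlap with the query."""
    query_words = set(query.lower().split())

    def score(node: Node) -> int:
        return len(query_words & set(node.text.lower().split()))

    return sorted(nodes, key=score, reverse=True)[:k]


chunks = ["loan terms", "interest rates", "collateral rules", "audit duties"]
tree = build_tree(chunks, summarize=lambda texts: " ".join(texts))
hits = retrieve(tree, "loan terms interest rates", k=1)
```

Because summaries and raw chunks are ranked in one pool, a broad query can match a higher-level node while a narrow query still lands on a leaf chunk.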

Session Speakers

Natasha Ueberschlag

Simon Schmitz