Session

Cache Smarter, Not Harder: Building a Semantic Cache Gateway with Lakebase and MLflow

Overview

Experience: In Person
Track: Artificial Intelligence & Agents
Industry: Enterprise Technology
Technologies: Agent Bricks, Lakebase
Skill Level: Intermediate
As LLM applications move into production, redundant calls quietly drive up latency, cost, and complexity. Traditional caching falls short: exact string matches miss the majority of real-world queries, where users ask the same question in different ways.

In this session, we introduce semantic caching, a smarter approach that retrieves responses based on intent, not syntax. You'll learn how to build a production-ready semantic cache gateway using Lakebase and MLflow to capture paraphrased and context-aware queries. We'll walk through why exact-match caching breaks down at scale, how to design a low-latency caching layer with a FastAPI gateway on Databricks Apps, and how to incorporate conversational context for multi-turn accuracy, plus the observability and governance patterns that make it production-safe.

Walk away with a deployable architecture and proven patterns to cut LLM costs by up to 80% while improving response speed and reliability.
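To make the core idea concrete, here is a minimal sketch of the semantic-cache lookup the session describes: embed each query, compare against cached entries by cosine similarity, and return a cached response when similarity clears a threshold. The `SemanticCache` class, its `threshold` default, and the caller-supplied `embed` function are illustrative assumptions, not the session's actual implementation; in the architecture presented, embeddings and cached responses would live in Lakebase behind a FastAPI gateway rather than in an in-memory list.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Toy semantic cache: a real deployment would use a vector index
    in Lakebase, not a linear scan over an in-memory list."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # callable: text -> list[float] (assumed)
        self.threshold = threshold  # minimum similarity to count as a hit
        self.entries = []           # list of (embedding, cached response)

    def get(self, query):
        """Return the closest cached response if it is similar enough, else None."""
        qv = self.embed(query)
        best, best_sim = None, 0.0
        for vec, response in self.entries:
            sim = cosine(qv, vec)
            if sim > best_sim:
                best, best_sim = response, sim
        return best if best_sim >= self.threshold else None

    def put(self, query, response):
        """Store a query's embedding alongside the LLM response."""
        self.entries.append((self.embed(query), response))
```

Because matching happens in embedding space, a paraphrase like "reset password please" can hit a cache entry created for "How do I reset my password?", which is exactly the class of queries exact-match caching misses.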

Session Speakers


Brian Law

Sr. Specialist Solutions Architect
Databricks


Ananya Roy

Sr. Specialist Solutions Architect
Databricks