Session

Thinking Fast & Slow: How Databricks Built High-Speed and Deep Research Agents

Overview

Experience: In Person
Track: Artificial Intelligence & Agents
Industry: Enterprise Technology
Technologies: Unity Catalog, Agent Bricks
Skill Level: Advanced

In today's competitive agentic search landscape, two operating modes are mission-critical. The first is a low-latency, low-cost mode for consumer-facing scale. It must meet strict tail-latency budgets for real-time, high-throughput use cases at minimal per-query cost, without sacrificing quality.

The second is a compute-intensive deep research mode enabling expert-level analysis—from financial due diligence and technology mapping to clinical review and manufacturing diagnostics. The system must plan multi-step retrieval, triangulate sources, and synthesize coherent answers.

Using our Instructed Retriever and Aroll frameworks, we built a unified agentic harness supporting both modes in a single architecture. Our system sits at the Pareto frontier of cost, speed, and quality: Instant mode delivers single-digit-second latency while maintaining accuracy, and Thinking mode achieves SoTA-level performance across enterprise domains.
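The two-mode harness described above can be sketched in miniature: one entry point routes each query to either a low-latency path or a multi-step research path. This is a hypothetical illustration only; the class and function names (`Query`, `instant_answer`, `thinking_answer`, `answer`) are assumptions for the sketch, not Databricks APIs.

```python
from dataclasses import dataclass

@dataclass
class Query:
    text: str
    deep_research: bool = False    # caller opts into compute-intensive mode
    latency_budget_s: float = 5.0  # tail-latency budget for instant mode

def instant_answer(q: Query) -> str:
    # Single retrieval pass plus answer synthesis (placeholder logic).
    return f"instant:{q.text}"

def thinking_answer(q: Query) -> str:
    # Multi-step plan -> retrieve -> triangulate -> synthesize (placeholder).
    steps = ["plan", "retrieve", "triangulate", "synthesize"]
    return "thinking:" + "->".join(steps) + ":" + q.text

def answer(q: Query) -> str:
    # One harness, two modes: the router picks a path per query, so both
    # operating modes share a single architecture.
    if q.deep_research or q.latency_budget_s > 30:
        return thinking_answer(q)
    return instant_answer(q)
```

A generous latency budget or an explicit deep-research flag sends the query down the thinking path; everything else stays on the instant path.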

Session Speakers


Michael Bendersky

Director (Research)
Databricks