Building Agent-Ready Data: Best Practices for Intelligent Document Processing at Scale

Overview

| Field | Details |
|---|---|
| Experience | In Person |
| Track | Artificial Intelligence & Agents |
| Industry | Enterprise Technology, Healthcare & Life Sciences, Financial Services |
| Technologies | Databricks SQL, Databricks Agents |
| Skill Level | Intermediate |

Roughly 80% of the context that powers your business (invoices, contracts, call transcripts, support tickets) lives in messy, unstructured data and documents. Turning that into a structured data layer is one of the biggest opportunities in your organization: it's what lets you build accurate and reliable agents and analytics on your enterprise's full context.

However, taking these massive-scale pipelines from prototype to production is harder than it looks. Accuracy, cost, and throughput stop being independent dials and become a three-way tradeoff you have to design around. In this session, we'll share the framework and the playbook for winning that tradeoff. We'll show you how to compose AI Functions (Databricks' research-backed, task-specific building blocks for intelligent document processing) into production-ready agentic workflows. You'll learn how to choose the right function for each task, iterate on quality effectively, and tune for cost and throughput at scale. Along the way, we'll demo new features across AI Functions that push what's possible.

You'll walk away with a concrete decision framework for building large-scale agentic workflows over unstructured data, plus the best practices to get them into production.

Session Speakers
Archika Dogra, Product Manager, Databricks
Nihit Desai, Software Engineer, Databricks