Session

Intelligent Document Processing with Lakeflow: Transforming Unstructured Data for Agents and Analytics

Overview

Experience: In Person
Track: Data Engineering & Streaming
Industry: Healthcare & Life Sciences, Manufacturing, Financial Services
Technologies: Lakeflow, Databricks Agents
Skill Level: Intermediate

A huge portion of the data that runs your business doesn't live in clean tables. It lives in millions of documents: invoices, contracts, forms, and filings. This is some of the richest context your enterprise has, and today's agents finally have a shot at unlocking it. But there's a catch: agents can't actually read your documents. On our OfficeQA benchmark, even frontier agents score below 50% on real enterprise document tasks because they struggle to read the documents themselves. Close that gap, and every agent, dashboard, and decision downstream gets sharper.

In this session, we'll show how Document Intelligence on Databricks turns unstructured documents into accurate, governed insights that power better agents and analytics. We'll walk through how to build intelligent document processing pipelines with Lakeflow and AI Functions purpose-built for document understanding, taking your documents from raw PDF to rich answers in Genie.

Walk away knowing how to turn your document corpus into the data layer that powers smarter, faster, and more trustworthy agents and analytics.

Session Speakers

Archika Dogra

Product Manager
Databricks

Jason Ping

Product Manager
Databricks