Session

Building Production-Ready Agents with a Self-Evolving Test Suite

Overview

Experience: In Person
Track: Artificial Intelligence & Agents
Industry: Enterprise Technology
Technologies: Databricks Agents
Skill Level: Intermediate

Building agents is changing. Most teams iterate on agent quality by vibe-checking: try a question, fix what looks wrong, ship. It works at first and breaks down fast, because each fix can revive an old bug. Coding agents make this worse by rewriting the agent in seconds while verification stays manual.

This session introduces a new best practice. You build your agent against a test suite that grows itself from inside the vibe-check loop. Every piece of feedback on a bad answer becomes an automated test. Every coding-agent fix runs against the accumulated suite. By the time your agent is ready to ship, you have a real test suite that grew alongside the build.
We walk through this end to end on stage, starting from a prototype agent with no tracing and no evals and ending with a production-ready agent backed by that test suite.
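
The loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the tooling shown in the session: `FeedbackCase`, `GrowingSuite`, the three toy agents, and the substring check are all hypothetical stand-ins, and a real setup would likely use richer scorers than "the answer contains this phrase".

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FeedbackCase:
    """One piece of reviewer feedback, frozen into a repeatable check (hypothetical)."""
    question: str
    must_contain: str  # assumption: feedback is reduced to a required phrase
    note: str = ""


@dataclass
class GrowingSuite:
    """Accumulates feedback cases and replays all of them against any agent version."""
    cases: List[FeedbackCase] = field(default_factory=list)

    def record_feedback(self, question: str, must_contain: str, note: str = "") -> None:
        # Every bad answer flagged during a vibe check becomes a permanent test.
        self.cases.append(FeedbackCase(question, must_contain, note))

    def run(self, agent: Callable[[str], str]) -> List[FeedbackCase]:
        # Return the cases the given agent version currently fails.
        return [
            case for case in self.cases
            if case.must_contain.lower() not in agent(case.question).lower()
        ]


# Toy agent versions standing in for successive coding-agent rewrites.
def agent_v1(question: str) -> str:
    return "I am not sure."


def agent_v2(question: str) -> str:
    if "refund" in question.lower():
        return "Refunds are processed within 30 days."
    return "I am not sure."


def agent_v3(question: str) -> str:
    # Fixes shipping, but the rewrite silently drops the refund answer.
    if "shipping" in question.lower():
        return "Standard shipping takes 5 days."
    return "I am not sure."


suite = GrowingSuite()
suite.record_feedback("What is the refund window?", "30 days", "answer omitted the window")
failures_v1 = suite.run(agent_v1)   # the flagged answer now fails automatically
failures_v2 = suite.run(agent_v2)   # the fix is verified by the same case

suite.record_feedback("How long does shipping take?", "5 days")
failures_v3 = suite.run(agent_v3)   # the old refund case catches the regression

for case in failures_v3:
    print(f"REGRESSION: {case.question!r} should mention {case.must_contain!r}")
```

The design point is that `run` always replays the full accumulated list, so a rewrite that fixes the newest complaint is still checked against every earlier one.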

Session Speakers

Adam Gurary

Senior Product Manager
Databricks

Yuki Watanabe

Senior Software Engineer
Databricks