Building Production-Ready Agents with a Self-Evolving Test Suite
Overview
| Experience | In Person |
|---|---|
| Track | Artificial Intelligence & Agents |
| Industry | Enterprise Technology |
| Technologies | Databricks Agents |
| Skill Level | Intermediate |
Building agents is changing. Most teams iterate on agent quality by vibe-checking: try a question, fix what looks wrong, ship. This works at first and breaks down fast, because each fix can revive an old bug. Coding agents make this worse: they can rewrite the agent in seconds while verification stays manual.
This session introduces a new best practice. You build your agent against a test suite that grows itself from inside the vibe-check loop. Every piece of feedback on a bad answer becomes an automated test. Every coding-agent fix runs against the accumulated suite. By the time your agent is ready to ship, you have a real test suite that grew alongside the build.
We walk through this on stage end to end, starting with a prototype agent that has no tracing and no evaluation and ending with a production-ready agent backed by the tests accumulated along the way.
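To make the idea of a suite that grows out of vibe-check feedback concrete, here is a minimal sketch in plain Python. It is not the tooling shown in the session. The names `FeedbackCase`, `GrowingSuite`, `record_feedback`, and the toy agents are all hypothetical, and the substring check stands in for whatever richer scoring a real evaluation setup would use.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: an "agent" is any function mapping a question to an answer.
Agent = Callable[[str], str]


@dataclass
class FeedbackCase:
    """One piece of vibe-check feedback, kept as a regression test."""
    question: str
    must_contain: str  # simplest possible check, used here for illustration


@dataclass
class GrowingSuite:
    """Hypothetical suite that accumulates a case for each piece of feedback."""
    cases: List[FeedbackCase] = field(default_factory=list)

    def record_feedback(self, question: str, must_contain: str) -> None:
        """Turn feedback on a bad answer into a permanent test case."""
        self.cases.append(FeedbackCase(question, must_contain))

    def run(self, agent: Agent) -> List[FeedbackCase]:
        """Run every accumulated case and return the ones that fail."""
        return [
            case for case in self.cases
            if case.must_contain.lower() not in agent(case.question).lower()
        ]


# Toy agents standing in for successive rewrites by a coding agent.
def agent_v1(question: str) -> str:
    return "I don't know."


def agent_v2(question: str) -> str:
    if "refund" in question.lower():
        return "Refunds are processed within 5 days."
    return "I don't know."


def agent_v3(question: str) -> str:
    # Adds shipping support but accidentally drops the refund behavior.
    if "shipping" in question.lower():
        return "Shipping takes 3 days."
    return "I don't know."


suite = GrowingSuite()

# Feedback on a bad refund answer becomes the first test.
suite.record_feedback("What is the refund window?", "5 days")
failures_v1 = suite.run(agent_v1)  # the new case fails against v1
failures_v2 = suite.run(agent_v2)  # the fix in v2 passes

# Later feedback adds a second test; v3 is checked against both.
suite.record_feedback("How long is shipping?", "3 days")
failures_v3 = suite.run(agent_v3)  # the revived refund bug is caught

print([case.question for case in failures_v3])
```

In this sketch the third rewrite fixes the shipping question but reintroduces the refund failure, and the accumulated suite flags it without anyone re-asking the old question by hand.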
Session Speakers
Adam Gurary, Senior Product Manager, Databricks
Yuki Watanabe, Senior Software Engineer, Databricks