Session

Building Production-Ready Agents with a Self-Evolving Test Suite

Overview

Experience: In Person
Track: Artificial Intelligence & Agents
Industry: Enterprise Technology
Technologies: Databricks Agents
Skill Level: Intermediate
Building agents is challenging. Most teams iterate on agent quality by vibe-checking: try a question, fix what looks wrong, ship. It works at first and breaks down fast, because each fix can revive an old bug. Coding agents make this worse by rewriting the agent in seconds while verification stays manual.

This session introduces a new best practice. With one MLflow command, your coding agent turns into a rigorous builder that grows a test suite from inside the vibe-check loop. Every piece of feedback on a bad answer becomes an automated test. Every coding-agent fix runs against the accumulated suite, scored and recorded in MLflow. By the time your agent is ready to ship, you have a real test suite.

We walk through this on stage end to end. We start with a prototype agent that has no tracing and no evaluation, and we end with a production-ready agent backed by a test suite that grew alongside the build.

Session Speakers

Adam Gurary

Senior Product Manager
Databricks

Yuki Watanabe

Senior Software Engineer
Databricks