Session

Building Production-Ready Agents with a Self-Evolving Test Suite

Overview

Experience	In Person
Track	Artificial Intelligence & Agents
Industry	Enterprise Technology
Technologies	Databricks Agents
Skill Level	Intermediate

Building agents is challenging. Most teams iterate on agent quality by vibe-checking. Try a question, fix what looks wrong, ship. It works at first and breaks down fast. Each fix can revive an old bug. Coding agents make this worse by rewriting the agent in seconds while verification stays manual.

This session introduces a new best practice. With one MLflow command, your coding agent turns into a rigorous builder that grows a test suite from inside the vibe-check loop. Every piece of feedback on a bad answer becomes an automated test. Every coding-agent fix runs against the accumulated suite, scored and recorded in MLflow. By the time your agent is ready to ship, you have a real test suite that grew alongside the build.

We walk through this on stage end to end. We start with a prototype agent that has no tracing and no eval, and we end with a production-ready agent backed by a test suite that grew alongside the build.

Session Speakers

Adam Gurary

Senior Product Manager
Databricks

Yuki Watanabe

Senior Software Engineer
Databricks