Session

Measure What Matters: Quality-Focused Monitoring for Production AI Agents

Overview

Experience: In Person
Type: Breakout
Track: Artificial Intelligence
Industry: Enterprise Technology
Technologies: MLflow, Mosaic AI
Skill Level: Intermediate
Duration: 40 min

Ensuring the operational excellence of AI agents in production requires robust monitoring capabilities that span both performance metrics and quality evaluation. This session explores Databricks' comprehensive Mosaic Agent Monitoring solution, designed to provide visibility into deployed AI agents through an intuitive dashboard that tracks critical operational metrics and quality indicators.

We'll demonstrate how to use the Agent Monitoring solution to iteratively improve a production agent so that it delivers a better customer support experience at a lower cost.

We will show how to:

  • Identify and proactively fix a quality problem with the GenAI agent’s responses before it becomes a major issue.
  • Understand users’ usage patterns, then implement and test a feature improvement to the GenAI agent.

Key session takeaways include:

  • Techniques for monitoring essential operational metrics, including request volume, latency, errors, and cost efficiency across your AI agent deployments
  • Strategies for implementing continuous quality evaluation using AI judges that assess correctness, guideline adherence, and safety without requiring ground truth labels
  • Best practices for setting up effective monitoring dashboards that enable dimension-based analysis across time periods, user feedback, and topic categories
  • Methods for collecting and integrating end-user feedback to create a closed-loop system that drives iterative improvement of your AI agents

Session Speakers

Niall Turbitt

Sr. Staff Data Scientist
Databricks

Eric Peter

Product - AI Platform
Databricks