Measure What Matters: Quality-Focused Monitoring for Production AI Agents
Overview
| Experience | In Person |
| --- | --- |
| Type | Breakout |
| Track | Artificial Intelligence |
| Industry | Enterprise Technology |
| Technologies | MLflow, Mosaic AI |
| Skill Level | Intermediate |
| Duration | 40 min |
Ensuring the operational excellence of AI agents in production requires robust monitoring capabilities that span both performance metrics and quality evaluation. This session explores Databricks' comprehensive Mosaic Agent Monitoring solution, designed to provide visibility into deployed AI agents through an intuitive dashboard that tracks critical operational metrics and quality indicators.
We'll demonstrate how to use the Agent Monitoring solution to iteratively improve a production agent, delivering a better customer support experience while reducing the cost of providing that support.
We will show how to:
- Identify and proactively fix a quality problem in the GenAI agent's responses before it becomes a major issue.
- Understand users' usage patterns and implement and test a feature improvement to the GenAI agent.
Key session takeaways include:
- Techniques for monitoring essential operational metrics, including request volume, latency, errors, and cost efficiency across your AI agent deployments (a minimal tracing sketch follows this list)
- Strategies for implementing continuous quality evaluation using AI judges that assess correctness, guideline adherence, and safety without requiring ground truth labels
- Best practices for setting up effective monitoring dashboards that enable dimension-based analysis across time periods, user feedback, and topic categories
- Methods for collecting and integrating end-user feedback to create a closed-loop system that drives iterative improvement of your AI agents
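As a brief illustration of the first takeaway, the sketch below shows one way to instrument a customer support agent with MLflow Tracing so that each request is captured as a trace, from which request volume, latency, and errors can be derived. The function name `answer_support_question`, the experiment path, and the model choice are hypothetical placeholders; the exact monitoring setup demonstrated in the session may differ.

```python
# Minimal sketch: record each agent request as an MLflow trace so that
# request volume, latency, and errors can be aggregated for monitoring.
# answer_support_question and the experiment path are hypothetical.
import mlflow
from openai import OpenAI

mlflow.set_experiment("/Shared/customer-support-agent")  # hypothetical experiment path
mlflow.openai.autolog()  # also capture the underlying LLM calls inside each trace

client = OpenAI()

@mlflow.trace(name="answer_support_question", span_type="AGENT")
def answer_support_question(question: str) -> str:
    """One agent request == one trace; exceptions are recorded on the trace as errors."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a concise customer-support assistant."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(answer_support_question("How do I reset my password?"))
```

Aggregating these traces over time, by topic, and by user feedback is roughly what the monitoring dashboard discussed in the session builds on.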
Session Speakers
Niall Turbitt
Sr. Staff Data Scientist
Databricks
Eric Peter
Product - AI Platform
Databricks