Sponsored by: Galileo Technologies Inc. | Taming Rogue AI Agents with Observability-Driven Evaluation

Overview
Thursday, June 12, 1:40 pm
| Experience | In Person |
|---|---|
| Type | Breakout |
| Track | Artificial Intelligence |
| Industry | Enterprise Technology, Media and Entertainment, Financial Services |
| Technologies | AI/BI |
| Skill Level | Intermediate |
| Duration | 40 min |
LLM agents often drift into failure when prompts, retrieval, external data, and policies interact in unpredictable ways. This technical session introduces a repeatable, metric-driven framework for detecting, diagnosing, and correcting these undesirable behaviors in agentic systems at production scale. We demonstrate how to instrument the agent loop with fine-grained signals (tool-selection quality, error rates, action progression, latency, and domain-specific metrics) and send them to an evaluation layer such as Galileo. This telemetry enables a virtuous cycle of system improvement. We present a practical example of a stock-trading system and show how brittle retrieval and faulty business logic cause undesirable behavior. We then refactor the prompts and adjust the retrieval pipeline, verifying recovery through improved metrics. Attendees will learn how to add observability with minimal code change, pinpoint root causes via tracing, and drive continuous, metric-validated improvement.
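As a rough illustration of the instrumentation idea in the abstract, the sketch below wraps each tool call in a small recorder and derives a few of the named signals (error rate, tool-selection accuracy, latency). It is not the speaker's implementation and does not use Galileo's SDK; `Trace`, `StepRecord`, the `expected_tool` label, and the toy `get_quote` / `place_order` tools are all hypothetical names invented for this example.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class StepRecord:
    """One tool call made by the agent (hypothetical schema)."""
    tool: str
    expected_tool: Optional[str]  # reference label, if one is available
    latency_s: float
    error: Optional[str]


@dataclass
class Trace:
    """Collects step records for a single agent run."""
    steps: List[StepRecord] = field(default_factory=list)

    def call(self, tool: str, fn: Callable[..., Any], *args: Any,
             expected_tool: Optional[str] = None, **kwargs: Any) -> Any:
        """Run a tool, record latency and any exception, return its result."""
        start = time.perf_counter()
        result, error = None, None
        try:
            result = fn(*args, **kwargs)
        except Exception as exc:
            # Sketch only: the error is recorded and swallowed so the loop continues.
            error = f"{type(exc).__name__}: {exc}"
        latency = time.perf_counter() - start
        self.steps.append(StepRecord(tool, expected_tool, latency, error))
        return result

    def summary(self) -> Dict[str, float]:
        """Aggregate per-step records into run-level metrics."""
        n = len(self.steps)
        if n == 0:
            return {"steps": 0, "error_rate": 0.0,
                    "tool_selection_accuracy": 0.0, "mean_latency_s": 0.0}
        labelled = [s for s in self.steps if s.expected_tool is not None]
        correct = sum(s.tool == s.expected_tool for s in labelled)
        return {
            "steps": n,
            "error_rate": sum(s.error is not None for s in self.steps) / n,
            "tool_selection_accuracy": correct / len(labelled) if labelled else 0.0,
            "mean_latency_s": sum(s.latency_s for s in self.steps) / n,
        }


# Toy tools standing in for a stock-trading agent's actions (assumed, not from the talk).
def get_quote(symbol: str) -> float:
    prices = {"ACME": 12.5}
    return prices[symbol]  # raises KeyError for unknown symbols


def place_order(symbol: str, qty: int) -> str:
    return f"ordered {qty} {symbol}"


trace = Trace()
trace.call("get_quote", get_quote, "ACME", expected_tool="get_quote")
trace.call("get_quote", get_quote, "ZZZZ", expected_tool="get_quote")  # brittle lookup fails
trace.call("place_order", place_order, "ACME", 10, expected_tool="get_quote")  # wrong tool chosen
report = trace.summary()
print(report)
```

In a real system the `summary()` dictionary, or the raw step records, would be exported to whatever evaluation layer is in use; the wrapper approach is one way to keep changes to the agent code small.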
Session Speakers
Atindriyo Sanyal
Co-founder
Galileo