Session

Sponsored by: Snorkel AI | Evaluating and Improving Performance of Agentic Systems

Overview

Experience: In Person
Type: Lightning Talk
Track: Artificial Intelligence
Industry: Health and Life Sciences, Manufacturing, Financial Services
Technologies: MLflow, Llama
Skill Level: Intermediate

GenAI systems are evolving beyond basic information retrieval and question answering, becoming sophisticated agents capable of managing multi-turn dialogues and executing complex, multi-step tasks autonomously. However, reliably evaluating and systematically improving their performance remains challenging. In this session, we'll explore methods for assessing the behavior of LLM-driven agentic systems, highlighting techniques and actionable insights for identifying performance bottlenecks and creating better-aligned, more reliable agentic AI systems.