Session
Sponsored by: Snorkel AI | Evaluating and Improving Performance of Agentic Systems
Overview
Experience | In Person |
---|---|
Type | Lightning Talk |
Track | Artificial Intelligence |
Industry | Health and Life Sciences, Manufacturing, Financial Services |
Technologies | MLflow, Llama |
Skill Level | Intermediate |
GenAI systems are evolving beyond basic information retrieval and question answering into sophisticated agents capable of managing multi-turn dialogues and executing complex, multi-step tasks autonomously. However, reliably evaluating and systematically improving their performance remains challenging. In this session, we'll explore methods for assessing the behavior of LLM-driven agentic systems, highlighting techniques and actionable insights for identifying performance bottlenecks and creating better-aligned, more reliable agentic AI systems.