Vibes to Validation: AI Eval for Partners

<ul>
<li><strong>Stop eyeballing Genie answers and start measuring them.</strong> In this hands-on session you'll attach a labeled evaluation set to a Genie Space, run the built-in benchmark, and read the per-question scores to find out where it actually fails. Bring the Genie Space you built yesterday, or spin one up from a track.</li>
<li>You'll write 15 questions across five failure-mode categories, label acceptable answers, and run the benchmark to pinpoint your weakest category. Then you'll change one line of instructions, re-run, and watch the score move. Sometimes it moves the wrong way, which is a real finding. You'll leave with a completed benchmark run and a repeatable method you can use on a customer's Genie Space next week.</li>
<li><em>Hands-on workshop - Bring a laptop - Free Edition account required - Lunch will be served.</em></li>
</ul>
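The per-category comparison described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the Genie benchmark itself: the category names, the pass/fail outcomes, and the helper functions `pass_rates`, `weakest_category`, and `score_deltas` are all hypothetical, and the sketch assumes you have already copied per-question results out of a benchmark run by hand.

```python
# Hypothetical benchmark results: category -> pass/fail per question.
# Category names and outcomes are invented for illustration; they are not
# Genie output and not the workshop's actual failure-mode categories.
BEFORE = {
    "ambiguous_terms": [True, False, False],
    "joins": [True, True, False],
    "date_filters": [True, True, True],
    "aggregations": [True, True, True],
    "out_of_scope": [True, True, False],
}

# Same 15 questions after changing one line of instructions (also invented).
AFTER = {
    "ambiguous_terms": [True, True, True],
    "joins": [True, True, False],
    "date_filters": [True, True, False],
    "aggregations": [True, True, True],
    "out_of_scope": [True, True, False],
}


def pass_rates(results):
    """Return the fraction of passing questions for each category."""
    return {
        category: sum(outcomes) / len(outcomes)
        for category, outcomes in results.items()
    }


def weakest_category(rates):
    """Return the category with the lowest pass rate."""
    return min(rates, key=rates.get)


def score_deltas(before_rates, after_rates):
    """Return the per-category change in pass rate between two runs."""
    return {
        category: after_rates[category] - before_rates[category]
        for category in before_rates
    }


before_rates = pass_rates(BEFORE)
after_rates = pass_rates(AFTER)
deltas = score_deltas(before_rates, after_rates)

if __name__ == "__main__":
    print("Weakest category before the change:", weakest_category(before_rates))
    for category, delta in deltas.items():
        # A negative delta is a regression, which is still a useful finding.
        print(f"{category}: {delta:+.2f}")
```

Keeping the two runs side by side per category, rather than comparing one overall score, is what makes a regression in one category visible even when the total improves.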
