Generating Laughter: Testing and Evaluating the Success of LLMs for Comedy
Overview
Experience | In Person |
---|---|
Type | Lightning Talk |
Track | Artificial Intelligence |
Industry | Education, Media and Entertainment |
Technologies | AI/BI, Databricks SQL, PyTorch |
Skill Level | Beginner |
Duration | 20 min |
Nondeterministic AI models, such as large language models (LLMs), offer immense creative potential but require new approaches to testing and scalability. Drawing on her experience running New York Times-featured Generative AI comedy shows, Erin Staples shows how traditional benchmarks may fall short and how embracing unpredictability can lead to innovative, laugh-inducing results.
This talk will explore methods like multi-tiered feedback loops, chaos testing and exploratory user testing, where AI outputs are evaluated not by rigid accuracy standards but by their adaptability and resonance across different contexts — from comedy generation to functional applications. Erin will emphasize the importance of establishing a root source of truth — a reliable dataset or core principle — to manage consistency while embracing creativity.
Whether you’re looking to generate a few laughs of your own or explore creative uses of Generative AI, this talk will inspire and delight enthusiasts of all levels.
Session Speakers
Erin Staples
Sr. Developer Experience Engineer
Galileo