Session

Generating Laughter: Testing and Evaluating the Success of LLMs for Comedy

Overview

Experience: In Person
Type: Lightning Talk
Track: Artificial Intelligence
Industry: Education, Media and Entertainment
Technologies: AI/BI, Databricks SQL, PyTorch
Skill Level: Beginner
Duration: 20 min

Nondeterministic AI models, such as large language models (LLMs), offer immense creative potential but require new approaches to testing and scalability. Drawing on her experience running New York Times-featured Generative AI comedy shows, Erin Staples explains how traditional benchmarks may fall short and how embracing unpredictability can lead to innovative, laugh-inducing results.

This talk will explore methods such as multi-tiered feedback loops, chaos testing and exploratory user testing, in which AI outputs are evaluated not by rigid accuracy standards but by their adaptability and resonance across different contexts, from comedy generation to functional applications. Erin will emphasize the importance of establishing a root source of truth (a reliable dataset or core principle) to maintain consistency while embracing creativity.

Whether you’re looking to generate a few laughs of your own or explore creative uses of Generative AI, this talk will inspire and delight enthusiasts of all levels.

Session Speakers

Erin Staples

Sr. Developer Experience Engineer
Galileo