Session
Pitch Data to Player Insights: How the Cleveland Guardians Use AI and Real-Time Analytics at Scale
Overview
| Experience | In Person |
|---|---|
| Track | Data Engineering & Streaming |
| Industry | Communications, Media & Entertainment |
| Technologies | AI/BI, Lakeflow |
| Skill Level | Intermediate |
The Cleveland Guardians run a real-time data and AI platform that ingests live game feeds, scores every pitch with neural network models, and delivers AI-powered scouting insights to analysts through natural language tools.

Behind the scenes, this system supports high-stakes, in-game decisions. Built on Databricks, it combines streaming ingestion, ETL pipelines, model serving, and live dashboards. The challenge was not just building it, but deploying it across dev, QA, and production without slowing innovation.

In this session, we will share how we standardized and scaled the platform using Databricks Asset Bundles. By consolidating assets into a version-controlled monorepo, we created a unified deployment framework for modular jobs, shared clusters, coordinated builds, and post-deployment orchestration.

You will leave with a practical blueprint for operationalizing a modern data and generative AI platform, plus candid lessons on what worked and what we would refine.
Session Speakers
Henry Qin
Director of Platform/Data Engineering
Cleveland Guardians
Michael Halverson
Senior Data Engineer
Cleveland Guardians