Session
Unlocking Video Data at Scale: VLM Batch Inference with Ray on Databricks
Overview
| Experience | In Person |
|---|---|
| Track | Artificial Intelligence & Agents |
| Industry | Enterprise Technology, Communications, Media & Entertainment, Transportation |
| Technologies | Unity Catalog |
| Skill Level | Advanced |
Organisations sit on vast video archives (surveillance, manufacturing, inspections), yet lack scalable methods to extract insights. This session presents a production-ready architecture for distributed video analytics using Vision Language Models (VLMs) on Databricks.

We'll walk through a three-stage accelerator: (1) video ingestion into Unity Catalog Volumes, (2) VLM registration with MLflow for reproducibility, and (3) distributed batch inference using Ray and vLLM with Qwen2.5-VL-32B. You'll see how Ray orchestrates GPU-accelerated inference across video datasets and how to extract structured entities from VLM outputs using Databricks AI Functions.

Attendees will leave with:

- A reusable pattern for multi-modal video intelligence at scale
- Working code integrating Ray, vLLM, and Unity Catalog
- Prompt engineering techniques for video inputs
- Cost and performance considerations for VLM workloads

Whether your use case is retail, manufacturing, or public safety, this pattern applies.
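To make the batch-inference stage concrete, here is a minimal sketch of its shape. It is not the session's code. It uses only the Python standard library so it runs anywhere, and every name in it is an assumption for illustration: the Volume paths, the prompt, and `stub_vlm`, which stands in for a real model call such as Qwen2.5-VL-32B served through vLLM. In a real deployment the per-batch function would presumably be handed to a distributed runner such as Ray Data rather than called in a loop.

```python
import json
from typing import Callable, Dict, Iterator, List

# Hypothetical Unity Catalog Volume paths; the abstract does not give a layout.
VIDEO_PATHS = [
    "/Volumes/main/video/raw/cam01.mp4",
    "/Volumes/main/video/raw/cam02.mp4",
    "/Volumes/main/video/raw/cam03.mp4",
]

# Hypothetical prompt asking the model for machine-readable output.
PROMPT = "Describe the scene and list visible objects as JSON."


def make_batches(paths: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield fixed-size batches of paths; the last batch may be shorter."""
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def stub_vlm(video_path: str, prompt: str) -> str:
    """Stand-in for a VLM call. Returns a fixed JSON string, not a real prediction."""
    return json.dumps({"video": video_path, "objects": ["person", "forklift"]})


def infer_batch(batch: List[str], model_fn: Callable[[str, str], str]) -> List[Dict]:
    """Run the model on every video in one batch and parse its JSON output."""
    return [json.loads(model_fn(path, PROMPT)) for path in batch]


def run_pipeline(
    paths: List[str], batch_size: int, model_fn: Callable[[str, str], str]
) -> List[Dict]:
    """Sequential driver; a distributed runner would replace this loop."""
    results: List[Dict] = []
    for batch in make_batches(paths, batch_size):
        results.extend(infer_batch(batch, model_fn))
    return results


results = run_pipeline(VIDEO_PATHS, batch_size=2, model_fn=stub_vlm)
```

Keeping `infer_batch` free of any scheduling logic is the point of the sketch: the same function can be tested locally with a stub and later mapped over batches by whatever orchestrator the pipeline uses.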
Session Speakers
Samantha Wise
Senior Specialist Solutions Engineer
Databricks