SESSION

Scaling Video Ad Classification Across Millions of Classes with GenAI

OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Generative AI
INDUSTRY: Media and Entertainment
TECHNOLOGIES: AI/Machine Learning, GenAI/LLMs, MLflow
SKILL LEVEL: Intermediate
DURATION: 40 min

Vivvix, an ad-intelligence company, derives real-time insights from diverse creatives (video, audio) using ML and GenAI. Our initial goal was to categorize video ads into 30,000 product classes, with a planned expansion to six million. While an initial transformer-based machine learning model achieved high accuracy, we anticipated challenges from exponential growth in training time and limited data per class. To address these issues, we turned to open-source LLMs. We used Llama 2, optimized with vLLM, to categorize creatives by identifying product categories and running similarity searches across the labels. Our baseline machine learning model achieved 69% accuracy with ~25,000 labels and a training dataset of ~200k creatives. Integrating LLMs gave a 15% uplift in accuracy. Combining both approaches, we devised a solution in which the LLM acts as a pre-processing step, generating summaries for subsequent machine learning analysis.
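
The similarity search across labels mentioned above can be illustrated with a minimal sketch. This is not the speakers' implementation: it assumes the LLM has already produced a short text summary of the creative, and it uses a simple bag-of-words cosine similarity as a stand-in for whatever embedding model a production system would use. The function names, the label list, and the example summary are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_labels(summary, labels, top_k=3):
    """Return the top_k labels most similar to the summary, best first."""
    query = Counter(tokenize(summary))
    scored = [(cosine(query, Counter(tokenize(label))), label) for label in labels]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(label, round(score, 3)) for score, label in scored[:top_k]]


# Hypothetical label set and LLM output, for illustration only.
LABELS = [
    "running shoes",
    "electric toothbrush",
    "pickup truck",
    "dog food",
]
llm_summary = "Ad shows an athlete lacing up lightweight running shoes before a race."

print(rank_labels(llm_summary, LABELS, top_k=2))
```

Because labels are matched by similarity rather than learned as fixed output classes, a design along these lines would let the label set grow without retraining a classifier, which is the scaling concern the abstract raises.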

SESSION SPEAKERS

Puneet Jain

Senior Specialist Solutions Architect
Databricks

Dong-Hwi Kim

Senior Machine Learning Engineer
Vivvix