Data + AI Summit 2023
SAN FRANCISCO, JUNE 26-29
VIRTUAL, JUNE 28-29

Building a Real-Time Model Monitoring Pipeline on Databricks

Tuesday, June 27 @4:00 PM

Overview

Deployment is almost never the final step in the ML lifecycle: models can degrade over time for a variety of reasons. Teams therefore need to own their models after release, monitoring them continuously for feature drift, concept drift, distribution drift, and similar changes, and issuing alerts or triggering retraining when necessary. With so many open source tools and frameworks available, it can be difficult to figure out how to make everything work together. In this technical deep dive, we will build a high-quality, real-time ML model monitoring pipeline from the ground up using Apache Spark™ on Databricks.
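
To make the idea of drift monitoring with alerts concrete, here is a minimal sketch in plain Python. It is not taken from the session: the choice of the Population Stability Index (PSI) as the drift metric, the function names `psi` and `needs_alert`, the bin count, and the 0.2 alert threshold (a commonly quoted rule of thumb) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def psi(baseline: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature.

    Assumption: PSI is used here only as one example of a drift metric.
    """
    lo, hi = min(baseline), max(baseline)
    # Equal-width bins over the baseline range; fall back to 1.0 for constant features.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def needs_alert(score: float, threshold: float = 0.2) -> bool:
    """Hypothetical alert rule: flag drift when PSI exceeds the threshold."""
    return score > threshold
```

A monitoring job could compute such a score for each feature on a schedule, compare it with the threshold, and then raise an alert or trigger retraining.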

In this session, we will walk through a use case in three stages. First, we set up a model serving pipeline that logs predictions to a stream in real time. Next, we configure a model metric monitoring pipeline that consumes from the stream and aggregates metrics over specific time windows. Finally, we integrate a model monitoring visualization pipeline so that these metrics can be viewed live on dashboards.
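
The time-window aggregation step can be sketched as follows. The session builds this with Apache Spark on Databricks; the code below is plain Python standing in for the streaming engine, so it only illustrates the windowing logic. The `PredictionEvent` record, the `aggregate_by_window` function, the 60-second tumbling window, and the choice of count and mean as metrics are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PredictionEvent:
    """Hypothetical record logged by the serving layer for each prediction."""
    timestamp: float   # seconds since epoch, when the prediction was served
    prediction: float  # model output


def aggregate_by_window(
    events: Iterable[PredictionEvent], window_seconds: int = 60
) -> Dict[int, Dict[str, float]]:
    """Group events into tumbling windows and compute count and mean per window."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for event in events:
        # The window key is the start time of the window the event falls in.
        window_start = int(event.timestamp // window_seconds) * window_seconds
        sums[window_start] += event.prediction
        counts[window_start] += 1
    # Return windows in chronological order, ready to feed a dashboard.
    return {
        start: {"count": counts[start], "mean_prediction": sums[start] / counts[start]}
        for start in sorted(counts)
    }
```

In a real streaming job the engine would maintain these windows incrementally as events arrive; the per-window rows produced here correspond to what a dashboard would plot over time.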


Type

  • Breakout

Experience

  • In Person

Track

  • DSML: Production ML / MLOps, Databricks Experience (DBX)

Industry

  • Enterprise Technology, Travel and Hospitality

Difficulty

  • Intermediate

Duration

  • 40 min

Session Speakers

Alon Gubkin

Co-Founder & CTO

Aporia

Anindya Saha

ML Platform Software Engineer
