Data Brew
Season 2, Episode 1

ML in Production

In the season opener, Matei Zaharia discusses how he entered the field of ML, best practices for productionizing ML pipelines, leveraging MLflow & the Lakehouse architecture for reproducible ML, and his current research in this field.

Listen to the audio

Back to all episodes

Guest


Matei Zaharia

Matei Zaharia is an Assistant Professor of Computer Science at Stanford University and Chief Technologist at Databricks. He started the Apache Spark project during his PhD at UC Berkeley in 2009, and has worked broadly in datacenter systems, co-starting the Apache Mesos project and contributing as a committer on Apache Hadoop. Today, Matei tech-leads the MLflow development effort at Databricks in addition to other aspects of the platform. Matei’s research work was recognized through the 2014 ACM Doctoral Dissertation Award for the best PhD dissertation in computer science, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE).

Denny LeePlayPlay hover00:06

Welcome to season two of Data Brew by Databricks with Denny and Brooke. This season, we’re focusing on machine learning. The series allows us to explore various topics in the data and AI community. Whether we’re talking about data engineering or data science, we will interview subject matter experts to dive deeper into these topics. And while we’re at it, we’ll be enjoying our morning brew. My name is Denny Lee. I’m a developer advocate at Databricks, and one of the co-hosts of Data Brew.

Expand full transcript