In this free three-part training series, we’ll explore how Databricks lets data scientists and ML engineers quickly move from experimentation to production-scale machine learning model deployments, all on the same platform. We’ll work with a single data set throughout the lifecycle, using scikit-learn, MLflow and Apache Spark™ on Databricks. The notebooks and data set are provided so you can follow along and practice at your own pace.
Joshua Cook
Senior curriculum engineer, Databricks
Doug Bateman
Master instructor, Databricks
In part 1, we’ll use scikit-learn on Databricks to explore a sample subset of the data using core statistical and data science principles for exploratory analysis.
In part 2, we’ll use a larger subset of the data and the insights gathered in part 1 to design an MLflow experiment to identify the best machine learning model for deployment.
In part 3, we’ll see how to use MLflow and Apache Spark to train and deploy a large-scale machine learning model using the entire data set.
If you want to continue practicing on your own, don't miss our Big Book of Data Science Use Cases. This how-to reference guide provides everything you need, including code samples and notebooks, so you can start getting your hands dirty putting the Databricks platform to work.
A collection of data science and machine learning talks from leading industry experts at Atlassian, Zynga, Starbucks, and more.