Practical ML | Training Series
Building and deploying machine learning models

In this free three-part training series, we’ll explore how Databricks lets data scientists and ML engineers move quickly from experimentation to production-scale machine learning model deployments, all on the same platform. We’ll work with a single data set throughout the lifecycle, using scikit-learn, MLflow and Apache Spark™ on Databricks. The notebooks and data set are provided so you can follow along and practice at your own pace.

Joshua Cook

Senior curriculum engineer, Databricks

Doug Bateman

Master instructor, Databricks

Part 1: Exploratory Data Analysis

In part 1, we’ll use scikit-learn on Databricks to explore a sample subset of the data using core statistical and data science principles for exploratory analysis.

Watch | Download notebook | Get started

Part 2: Experimentation and Modelling

In part 2, we’ll use a larger subset of the data, together with the insights gathered in part 1, to design an MLflow experiment that identifies the best machine learning model for deployment.

Watch | Download notebook | Get started

Part 3: Productionization at Scale

In part 3, we’ll see how to use MLflow and Apache Spark to train and deploy a large-scale machine learning model using the entire data set.

Watch | Download notebook | Get started

What's Next?

If you want to continue to practice on your own, don't miss our Big Book of Data Science Use Cases. This how-to reference guide provides everything you need — including code samples and notebooks — so you can start getting your hands dirty putting the Databricks platform to work.


Recommended Spark + AI Summit Tech Talks

A collection of data science and machine learning talks by leading industry experts at Atlassian, Zynga, Starbucks, and more.

Session 1
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflow at Atlassian
Session 2
Productionizing Deep Reinforcement Learning with Spark and MLflow at Zynga
Session 3
Translating Models to Medicine: An Example of Managing Visual Communications at Seattle Children's
Session 4
Operationalizing Machine Learning at Scale at Starbucks
Session 5
Generative Hyperloop Design: Managing Massively Scaled Simulations Focused on Quick-Insight Analytics and Demand Modelling
Session 6
Patterns and Anti-Patterns for Memorializing Data Science Project Artifacts at BlueCross BlueShield
Session 7
Saving Energy in Homes with a Unified Approach to Data and AI at Quby
Session 8
Machine Learning Data Lineage with MLflow and Delta Lake
