Deep Learning with Databricks
Description
This course begins by covering the basics of neural networks and the tensorflow.keras API. We will focus on how to use Spark to scale our models, including distributed training, hyperparameter tuning, and inference, while using MLflow to track, version, and manage these models. We will take a deep dive into distributed deep learning, with hands-on examples that compare and contrast techniques for distributed data preparation, such as Petastorm and TFRecord, and for distributed training, such as Horovod and spark-tensorflow-distributor. To better understand a model’s predictions, you will apply model interpretability libraries. You will also learn the concepts behind Convolutional Neural Networks (CNNs) and transfer learning, and apply them to image classification tasks. We will wrap up the course by covering Recurrent Neural Networks (RNNs) and attention-based models for natural language processing (NLP) applications.
Duration
2 full days or 4 half days
Objectives
Build deep learning models using tensorflow.keras
Tune hyperparameters at scale with Hyperopt and Spark
Track, version, and manage experiments using MLflow
Perform distributed inference at scale using pandas UDFs
Scale and train distributed deep learning models using Horovod
Apply model interpretability libraries, such as SHAP, to understand model predictions
Use CNNs and transfer learning for image classification tasks
Use RNNs, attention-based models, and transfer learning for NLP tasks
Prerequisites
Intermediate experience with Python and pandas (or completion of Introduction to Python for Data Science & Data Engineering)
Familiarity with Apache Spark (or completion of Apache Spark Programming)
Working knowledge of machine learning and data science (or completion of Scalable Machine Learning with Apache Spark)
Logistics
Classes are delivered online via Zoom. Please confirm that you can access Zoom before the class begins.
Some classes may also use Slack for classroom communication. Please test your Slack access before class. If you have difficulty connecting to Slack, disconnect from your VPN.
If your company laptop has firewall restrictions, we recommend that you use a personal laptop for the training.
Please have a supported browser installed.
Outline
Day 1
Neural network and tf.keras fundamentals
Improve models with data standardization, callbacks, checkpointing, and more
Track and version models with MLflow
Distributed inference with pandas UDFs
Distributed hyperparameter tuning with Hyperopt
Large-scale data preparation with Petastorm
Day 2
Distributed model training with Horovod and Petastorm
Model interpretability with SHAP
CNNs for image classification and transfer learning
Distributed training with TFRecord using spark-tensorflow-distributor
Deploy a REST endpoint using MLflow Model Serving on Databricks
Textual embeddings, RNNs, attention-based models, and transfer learning for named entity recognition (NER)
Upcoming Public Classes
Public Class Registration
If your company has purchased success credits or has a learning subscription, please fill out the public training requests form. Otherwise, you can register below.
Private Class Delivery
If your organization would like to request a private delivery of the course, please fill out the request form below.
Questions?
If you have any questions, please refer to our Frequently Asked Questions page.