Databricks Certified Machine Learning Professional

Use Databricks Machine Learning and its capabilities to perform advanced machine learning in production tasks

The Databricks Certified Machine Learning Professional certification exam assesses an individual’s ability to use Databricks Machine Learning and its capabilities to perform advanced machine learning in production tasks. This includes the ability to track, version, and manage machine learning experiments and to manage the machine learning model lifecycle. The exam also assesses the ability to implement strategies for deploying machine learning models and to build monitoring solutions that detect data drift. Individuals who pass this certification exam can be expected to perform advanced machine learning engineering tasks using Databricks Machine Learning.

Registration

To achieve this certification, earners must pass a certification exam. To register for the exam, please either log in or create an account in our certification platform.

Learning Pathway

This certification is part of the Machine Learning learning pathway. Before attempting this certification, it is recommended that learners obtain the Machine Learning Associate certification.

Exam Details

Key details about the certification exam are provided below.

Minimally Qualified Candidate

The minimally qualified candidate should be able to:

  • Track, version, and manage machine learning experiments, including:

    • Data management with Delta Lake and Feature Store (creating and using tables)

    • Experiment tracking with MLflow (logging models and metrics, querying past runs, loading models)

    • Advanced experiment tracking (model signatures, input examples, nested runs, Databricks Autologging, hyperparameter tuning, artifact tracking)

  • Manage the machine learning model lifecycle, including:

    • Applying preprocessing logic in production environments (types of flavors, easing downstream use, saving/loading models)

    • Model management with MLflow Model Registry (capabilities, registering models, adding new model versions, transitioning model stages, deleting models and model versions)

    • Automating model management pipelines (implementing Model Registry Webhooks, incorporating Databricks Jobs)

  • Implement strategies for deploying machine learning models, including:

    • Batch (batch deployment options, scaling single-node models with Spark UDFs, optimizing written prediction tables, scoring using Feature Store tables)

    • Streaming (streaming deployment options, scaling single-node models in streaming pipelines)

    • Real-time (real-time deployment options, RESTful deployment with MLflow Model Serving, querying MLflow Model Serving models)

  • Build monitoring solutions for drift detection, including:

    • Types of drift (data drift, concept drift)

    • Drift tests and monitoring (numerical tests, categorical tests, input-label comparison tests)

    • Comprehensive drift solutions (drift monitoring architectures)
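As an illustration of the numerical and categorical drift tests listed above, the sketch below compares a reference sample with a current sample using only the Python standard library. The exam guide does not say which tests are expected; the two-sample Kolmogorov-Smirnov statistic (numerical) and the Pearson chi-square statistic (categorical) are assumed here as common choices. The function names `ks_statistic` and `chi_square_statistic` are hypothetical, and the sketch returns raw statistics only; turning them into p-values or alert thresholds is left out.

```python
from bisect import bisect_right
from collections import Counter


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def chi_square_statistic(reference, current):
    """Pearson chi-square statistic for a 2 x K table of category counts."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref_counts, cur_counts = Counter(reference), Counter(current)
    n_ref, n_cur = len(reference), len(current)
    total = n_ref + n_cur
    stat = 0.0
    for category in set(ref_counts) | set(cur_counts):
        column_total = ref_counts[category] + cur_counts[category]
        for observed, row_total in (
            (ref_counts[category], n_ref),
            (cur_counts[category], n_cur),
        ):
            # Expected count if both samples shared the same category distribution.
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical feature values: training-time sample vs. recent production sample.
    print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
    print(chi_square_statistic(["a", "a", "b"], ["a", "b", "b"]))
```

Both statistics are zero when the two samples have identical distributions and grow as they diverge. In practice a library implementation that also reports p-values would normally be preferred.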

Duration

Testers will have 120 minutes to complete the certification exam.

Questions

There are 60 multiple-choice questions on the certification exam. The exact distribution of questions across high-level topics will be provided upon release of the certification exam.

Cost

Each attempt of the certification exam costs the tester $200. Testers may be subject to taxes depending on their location. Testers can retake the exam as many times as they like, but they must pay $200 for each attempt.

Test Aids

There are no test aids available during this exam.

Programming Language

All machine learning code within this exam will be in Python. For workflows or code not specific to machine learning tasks, data manipulation code may be provided in SQL.

Expiration

Because of the speed at which the responsibilities of a machine learning practitioner and capabilities of the Databricks Lakehouse Platform change, this certification is valid for 2 years following the date on which each tester passes the certification exam.

Preparation

In order to learn the content assessed by the certification exam, candidates should take one of the following Databricks Academy courses:

Candidates are also able to learn more about the certification exam by taking the certification exam’s overview course (coming soon).

Frequently Asked Questions

To view answers to frequently asked questions (FAQs), please refer to the Databricks Academy FAQ document.