Data Management and Governance with Unity Catalog

**Important Notice:** This course will be retired on December 12, 2025.
Databricks recommends completing it by December 11, 2025. Alternatively, based on your learning needs, you can enroll in one of the following replacement courses:

- Get Started with Data Governance on Databricks – Learn foundational data governance concepts, including Unity Catalog and fine-grained access controls.

- DevOps Essentials for Data Engineering – Continue your Data Engineering learning journey with essential DevOps principles and practices on Databricks.

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In this Data Governance with Unity Catalog session, you'll learn concepts and perform labs that showcase workflows using Unity Catalog, Databricks’ solution for data governance. We'll start with a brief introduction to Unity Catalog, discuss fundamental data governance concepts, and then dive into a variety of topics, including using Unity Catalog for data access control, managing external storage and tables, data segregation, and more.

Languages Available: English | 日本語 | Português BR | 한국어

Skill Level

Associate

Duration

Prerequisites

- Familiarity with the Databricks Lakehouse (completion of the course Fundamentals of the Databricks Data Intelligence Platform V2)

- Basic knowledge of Python programming, the Jupyter notebook interface, and PySpark fundamentals

- Familiarity with data governance topics

- Beginner familiarity with cloud computing concepts (virtual machines, object storage, etc.)

- Intermediate experience with basic SQL concepts such as SQL commands, aggregate functions, filters and sorting, indexes, tables, and views.

Self-Paced

Custom-fit learning paths for data, analytics, and AI roles and career paths through on-demand videos


Registration options

Databricks has a delivery method for wherever you are on your learning journey

Self-Paced

Custom-fit learning paths for data, analytics, and AI roles and career paths through on-demand videos

Instructor-Led

Public and private courses taught by expert instructors, ranging from half a day to two days

Blended Learning

Self-paced and weekly instructor-led sessions for every style of learner to optimize course completion and knowledge retention. Go to the Subscriptions Catalog tab to purchase.

Skills@Scale

Comprehensive training offering for large-scale customers that includes learning elements for every style of learning. Inquire with your account executive for details.

Upcoming Public Classes

Model Development at Scale

In this course, you will develop an in-depth understanding of how to design, implement, and govern scalable machine learning systems that operate effectively at enterprise scale. The curriculum is organized into three experiential modules: developing distributed ML workflows with frameworks such as Apache SparkML and Ray, transitioning local ML development to distributed compute using tools like Pandas on Spark, and operationalizing and governing production models with Databricks’ MLOps ecosystem.

Through hands-on projects, you will construct end-to-end distributed ML pipelines using the SparkML workflow, applying Transformers, Estimators, and the fit/transform paradigm for both classification and regression tasks. You will version, compare, and manage experiments using MLflow 3.0 to ensure reproducibility and governance, capturing lineage between data, features, and model artifacts. Additionally, you will apply scalable Hyperparameter Optimization frameworks to improve model performance at scale.

The course concludes by demonstrating complete lifecycle management, from experimentation to production deployment, using Unity Catalog and Model Serving. You will learn to operationalize trained models, monitor their performance, and implement strong governance over models, features, and Delta assets within the Databricks environment.

Free

Professional

Platform Administrator

Get Started with Data Governance on Databricks - Japanese

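This course uses hands-on demos and a capstone lab to cover Unity Catalog and fine-grained access control on Databricks. You will learn about table types, catalog and schema organization, group-based access management, and access control migration strategies. The course includes demos on applying fine-grained access control with row-level security and column masking, attribute-based access control, combining controls, and migrating controls, as well as a lab for a comprehensive governance implementation.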

Free

Instructor-Led

Onboarding

Machine Learning Practitioner

Advanced Machine Learning Operations

In this course, you will gain a comprehensive understanding of the machine learning lifecycle and MLOps, with an emphasis on best practices for data and model management, testing, and scalable architectures. The course covers key MLOps components, including CI/CD, pipeline management, and environment separation, and showcases Databricks’ tools for automation and infrastructure management, such as Databricks Asset Bundles (DABs), Workflows, and Mosaic AI Model Serving. You will learn about monitoring, custom metrics, drift detection, model rollout strategies, A/B testing, and the principles of reliable MLOps systems, giving you a holistic view of implementing and managing ML projects in Databricks.

Note:

1. This course is the second in the series of Advanced Machine Learning.

2. Databricks Academy is transitioning from video lectures to a more streamlined PDF format with slides and notes for all self-paced courses. Please note that demo videos will still be available in their original format. We would love to hear your thoughts on this change, so please share your feedback through the course survey at the end. Thank you for being a part of our learning community!

Free

Professional