
Data Governance at Scale

In this course, you will learn how to implement data governance at scale on Databricks using Unity Catalog, with a focus on attribute-based access control, observability, and federated sharing. You will configure ABAC with governed tags, migrate from legacy fine-grained controls, enable and use system tables for audit and cost monitoring, deploy Lakehouse Monitoring for data and model quality, interpret lineage for impact and compliance, and apply federated governance and Delta Sharing patterns for secure cross-cloud collaboration.
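To give a flavor of the statements the course works with, here is a minimal sketch run from a Databricks notebook; the table, column, and group names (main.sales.customers, email, compliance_admins) are hypothetical placeholders, and the governed-tag/ABAC policy syntax taught in the course may differ from the legacy-style row filter and column mask shown here.

```python
# Tag a column so discovery and tag-driven policies can key off the classification.
spark.sql("ALTER TABLE main.sales.customers ALTER COLUMN email SET TAGS ('pii' = 'email')")

# Legacy-style fine-grained controls: a row filter and a column mask,
# defined as SQL UDFs and then attached to the table.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.sales.us_only(region STRING) RETURNS BOOLEAN
  RETURN IF(is_account_group_member('compliance_admins'), TRUE, region = 'US')
""")
spark.sql("ALTER TABLE main.sales.customers SET ROW FILTER main.sales.us_only ON (region)")

spark.sql("""
  CREATE OR REPLACE FUNCTION main.sales.mask_email(email STRING) RETURNS STRING
  RETURN CASE WHEN is_account_group_member('compliance_admins') THEN email ELSE '***' END
""")
spark.sql("ALTER TABLE main.sales.customers ALTER COLUMN email SET MASK main.sales.mask_email")

# Observability via system tables: recent audit events and billed usage.
display(spark.sql("""
  SELECT event_time, user_identity.email, action_name
  FROM system.access.audit
  WHERE event_date >= date_sub(current_date(), 7)
  ORDER BY event_time DESC
  LIMIT 20
"""))
display(spark.sql("""
  SELECT usage_date, sku_name, SUM(usage_quantity) AS dbus
  FROM system.billing.usage
  GROUP BY usage_date, sku_name
  ORDER BY usage_date DESC
"""))
```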

Skill Level
Associate
Duration
3h
Prerequisites

Complete the following course before taking this one:

• Databricks Fundamentals (or equivalent introductory Databricks course)


The content of this course was developed for participants with the following skills, knowledge, and abilities:

• Familiarity with the Databricks platform and basic workspace operations (creating and attaching clusters, running notebooks, managing basic job runs).  

• Working knowledge of core data governance concepts such as access control, permissions, and security policies in a data platform.  

• Intermediate SQL experience, including creating and managing tables, views, and functions, and granting/revoking privileges on database objects (a brief recap sketch follows this list).

• Understanding of Unity Catalog’s basic object model (metastore, catalogs, schemas, tables, volumes, functions, models).  

• Basic understanding of data lineage and how data moves between sources, transformations, and downstream analytics or ML assets.  

• Familiarity with fine-grained security techniques like row-level filters and column masking, even if not yet implemented in Unity Catalog.  

• Beginner-level knowledge of cloud concepts (compute, storage, identities/groups) on at least one major cloud provider.  

• Basic awareness of metadata management and data discovery practices in modern data platforms.
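As a quick self-check on the privilege-management prerequisite above, here is a minimal sketch of the style of GRANT/REVOKE statements you should already be comfortable with, run from a Databricks notebook; the catalog, schema, table, and group names are hypothetical placeholders.

```python
# Minimal recap of Unity Catalog privilege management (hypothetical names).
spark.sql("GRANT USE CATALOG ON CATALOG main TO `analysts`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA main.sales TO `analysts`")
spark.sql("REVOKE SELECT ON TABLE main.sales.customers FROM `interns`")

# Inspect the privileges currently granted on an object.
display(spark.sql("SHOW GRANTS ON SCHEMA main.sales"))
```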


Registration options

Databricks has a delivery method for wherever you are on your learning journey


Self-Paced

Custom-fit learning paths for data, analytics, and AI roles and careers, delivered through on-demand videos

Register now


Instructor-Led

Public and private courses taught by expert instructors, in half-day to two-day formats

Register now


Blended Learning

Self-paced content plus weekly instructor-led sessions for every style of learner, designed to optimize course completion and knowledge retention. Go to the Subscriptions Catalog tab to purchase.

Purchase now


Skills@Scale

A comprehensive training offering for large-scale customers that includes learning elements for every learning style. Inquire with your account executive for details.

Upcoming Public Classes

Get Started with Lakebase

This Get Started course introduces Databricks Lakebase, a fully managed PostgreSQL service built into the Databricks Data Intelligence Platform that brings operational (OLTP) and analytical (OLAP) workloads closer together.

The course begins with a conceptual lecture that compares OLTP and OLAP systems, explaining their different performance characteristics, storage models, and typical use cases. You will also explore the challenges organizations face when maintaining separate transactional databases and analytical platforms, including data movement, latency, and architectural complexity.

You will then learn how Databricks Lakebase helps address these challenges by providing a PostgreSQL-compatible operational database that integrates directly with the Databricks Lakehouse, enabling operational applications and analytics to work together within a unified platform.

Through hands-on labs, you will:

• Create and explore a Lakebase project using autoscaling compute

• Navigate the Lakebase UI, including branching, monitoring, and configuration settings

• Create and query tables using the Lakebase SQL Editor

• Query Lakebase data from Databricks using Lakehouse Federation and foreign catalogs (see the sketch after this list)

• Perform Reverse ETL by synchronizing Delta tables to Lakebase

• Connect to Lakebase from Python and perform basic CRUD operations (a minimal example appears below)
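To illustrate the Lakehouse Federation lab, here is a minimal sketch run from a Databricks notebook; the connection name, host, secret scope, database, and table names are hypothetical placeholders, and the lab itself may use different objects.

```python
# 1. Register a connection to the Lakebase (PostgreSQL-compatible) instance.
spark.sql("""
  CREATE CONNECTION IF NOT EXISTS lakebase_conn TYPE postgresql
  OPTIONS (
    host 'your-lakebase-host.example.com',
    port '5432',
    user secret('demo_scope', 'lakebase_user'),
    password secret('demo_scope', 'lakebase_password')
  )
""")

# 2. Expose the Postgres database as a foreign catalog in Unity Catalog.
spark.sql("""
  CREATE FOREIGN CATALOG IF NOT EXISTS lakebase_cat
  USING CONNECTION lakebase_conn
  OPTIONS (database 'appdb')
""")

# 3. Query Lakebase tables with the usual three-level namespace.
display(spark.sql("SELECT * FROM lakebase_cat.public.orders LIMIT 10"))
```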

This is a Get Started course, so the focus is on understanding the core concepts and basic workflows for working with Lakebase. Building full production applications on top of Lakebase is outside the scope of this course.
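And as a flavor of the Python connectivity lab, here is a minimal CRUD sketch; because Lakebase is PostgreSQL-compatible, a standard Postgres driver such as psycopg2 is assumed, and the host, credentials, and table are hypothetical placeholders.

```python
import psycopg2

# Connection details are placeholders; in practice they come from your Lakebase instance.
conn = psycopg2.connect(
    host="your-lakebase-host.example.com",
    port=5432,
    dbname="appdb",
    user="app_user",
    password="app_password",
    sslmode="require",
)

with conn, conn.cursor() as cur:
    # Create
    cur.execute("CREATE TABLE IF NOT EXISTS orders (id SERIAL PRIMARY KEY, item TEXT, qty INT)")
    cur.execute("INSERT INTO orders (item, qty) VALUES (%s, %s)", ("widget", 3))

    # Read
    cur.execute("SELECT id, item, qty FROM orders ORDER BY id")
    print(cur.fetchall())

    # Update
    cur.execute("UPDATE orders SET qty = qty + 1 WHERE item = %s", ("widget",))

    # Delete
    cur.execute("DELETE FROM orders WHERE item = %s", ("widget",))

conn.close()
```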

Note: For SCORM lecture files, please ensure that you close the SCORM window after completing the content. Do not click the ‘Next Lesson’ button, as doing so may prevent the SCORM module from being marked as complete.

Paid & Subscription
3h
Lab
Onboarding

Questions?

If you have any questions, please refer to our Frequently Asked Questions page.