Security & Trust Center

Your data security is our priority

Unified Security for Data and AI

What is Unity Catalog?

Databricks Unity Catalog offers a unified governance layer for your data and AI assets and is natively built into the Databricks Data Intelligence Platform. Unity Catalog enables seamless governance of structured and unstructured data, machine learning models, notebooks, dashboards, files, functions and views across any cloud or platform. It also simplifies collaboration for data scientists, analysts and engineers, and uses AI to improve productivity and maximize the potential of the lakehouse architecture. The unified governance approach accelerates data and AI initiatives while streamlining regulatory compliance.

Third-party tested by IOActive

Key security concepts

While Unity Catalog provides many benefits to data and AI teams, we’ll focus on the security concepts here and direct you to the Unity Catalog product page and our docs (AWS, Azure and GCP) to learn more.

Secure all your data and AI assets in one place

Unity Catalog creates a common data governance model and simplifies governance for your data and ML models on any cloud. With Unity Catalog, you can define access policies once at the account level and enforce them across all workloads and workspaces. Unity Catalog also provides centralized, fine-grained auditing: it captures an audit log of actions performed against your data, which helps you meet your security, compliance and audit requirements.

Move beyond file permissions

Organizations that adopt the lakehouse architecture do not need to duplicate basic cloud storage permissions and auditing across every tool that accesses their data. Instead, they can move to permission structures relevant to the business, such as tables, views and models. This architecture lets them centrally define (and audit) access to data, flexibly restrict access using views, row filters and column masks, and understand sensitive data movement via data lineage.

Apart from the duplication across tools, why not grant access using the basic cloud storage permissions? The following are not possible with just S3, ADLS or GCS permissions but are possible with a proper governance engine like Unity Catalog:

  1. Views and fine-grained access control with row filters and column masks
  2. Centralized auditing and lineage, even at the column level, along with other Unity Catalog monitoring features
  3. End-user search and discovery of data
  4. Capturing rich table metadata and comments (informs AI assistants, among other benefits)
  5. A cohesive data lifecycle that allows users to reliably access data even when the data or data storage changes

We strongly recommend adopting fine-grained governance to simplify implementation of security best practices and adherence to new regulations and privacy requirements.
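
To make the idea of row filters and column masks concrete, here is a minimal conceptual sketch in plain Python. It is not Unity Catalog code: in Unity Catalog these rules are defined as SQL functions and enforced by the platform. The User class, the group names, the sample rows and the read_table helper are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Row = Dict[str, str]


@dataclass(frozen=True)
class User:
    """Hypothetical user model: a name plus the groups the user belongs to."""
    name: str
    groups: FrozenSet[str]


def region_row_filter(user: User, row: Row) -> bool:
    """Row filter: admins see every row; others see only rows for their region group."""
    if "admins" in user.groups:
        return True
    return f"region_{row['region']}" in user.groups


def ssn_column_mask(user: User, value: str) -> str:
    """Column mask: only members of 'hr' see the raw value; others see the last 4 digits."""
    if "hr" in user.groups:
        return value
    return "***-**-" + value[-4:]


def read_table(
    user: User,
    rows: List[Row],
    row_filter: Callable[[User, Row], bool],
    masks: Dict[str, Callable[[User, str], str]],
) -> List[Row]:
    """Apply the row filter first, then mask the configured columns of surviving rows."""
    visible = []
    for row in rows:
        if not row_filter(user, row):
            continue
        visible.append(
            {col: masks[col](user, val) if col in masks else val for col, val in row.items()}
        )
    return visible


# Illustrative data only.
ROWS = [
    {"name": "Ana", "region": "emea", "ssn": "123-45-6789"},
    {"name": "Bo", "region": "apac", "ssn": "987-65-4321"},
]

analyst = User("analyst", frozenset({"region_emea"}))
hr_admin = User("hr_admin", frozenset({"admins", "hr"}))

analyst_view = read_table(analyst, ROWS, region_row_filter, {"ssn": ssn_column_mask})
hr_view = read_table(hr_admin, ROWS, region_row_filter, {"ssn": ssn_column_mask})
```

In this sketch the analyst sees only the EMEA row with a masked SSN, while the HR admin sees both rows unmasked. The point is that the same table yields different results per user without any change to storage permissions.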

Important:

Never provide users or applications with storage-level access to Unity Catalog-managed tables or volumes. Doing so will bypass the security and governance controls you've put in place.

Manage fine-grained access controls with ease

Unity Catalog uses open, standard ANSI SQL functions to define row filters and column masks, allowing fine-grained access controls on rows and columns. It further simplifies data stewardship with centralized and decentralized admin models (including hierarchical privileges) that allow security and data teams to assign the correct level of permissions to any given data or business team. Databricks also makes this work across programming languages and supports secure user isolation within our workloads.
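
The hierarchical privileges mentioned above can be pictured with a small model. The following Python sketch assumes a three-level namespace (catalog.schema.table) in which a privilege granted on a parent object is inherited by its children, and in which reading a table also requires USE CATALOG and USE SCHEMA on its parents. The Grants class, the principal and the object names are hypothetical; this is a simplified illustration, not the Unity Catalog implementation.

```python
from typing import Dict, Set, Tuple


class Grants:
    """Toy privilege store: grants on a parent object are inherited by its children."""

    def __init__(self) -> None:
        # Maps (principal, securable full name) to the privileges granted directly on it.
        self._grants: Dict[Tuple[str, str], Set[str]] = {}

    def grant(self, principal: str, privilege: str, securable: str) -> None:
        self._grants.setdefault((principal, securable), set()).add(privilege)

    def has_privilege(self, principal: str, privilege: str, securable: str) -> bool:
        """True if the privilege is granted on the securable or on any ancestor."""
        parts = securable.split(".")
        # Walk from the object itself up to the catalog: a.b.c, then a.b, then a.
        for depth in range(len(parts), 0, -1):
            ancestor = ".".join(parts[:depth])
            if privilege in self._grants.get((principal, ancestor), set()):
                return True
        return False

    def can_select(self, principal: str, table: str) -> bool:
        """Reading a table needs USE CATALOG, USE SCHEMA and SELECT (direct or inherited)."""
        catalog, schema, _ = table.split(".")
        return (
            self.has_privilege(principal, "USE CATALOG", catalog)
            and self.has_privilege(principal, "USE SCHEMA", f"{catalog}.{schema}")
            and self.has_privilege(principal, "SELECT", table)
        )


grants = Grants()
# Hypothetical setup: the 'analysts' group may browse the 'sales' catalog,
# but may only read tables in the 'sales.emea' schema.
grants.grant("analysts", "USE CATALOG", "sales")
grants.grant("analysts", "USE SCHEMA", "sales")
grants.grant("analysts", "SELECT", "sales.emea")

emea_ok = grants.can_select("analysts", "sales.emea.orders")
apac_ok = grants.can_select("analysts", "sales.apac.orders")
```

A single SELECT grant at the schema level covers every table in that schema, which is what lets a data team delegate a whole schema to a business team without listing tables one by one.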

Secure data sharing across organizations

Unity Catalog natively supports Delta Sharing, the world’s first open protocol for secure data sharing, enabling you to easily share existing data in Delta Lake and Apache Parquet formats with any computing platform. Consumers don’t have to be on the Databricks Platform, on the same cloud or on any cloud at all. You can share live data without replicating or copying it to another system. Native integrations with Power BI, Tableau, Apache Spark™, pandas and Java allow recipients to consume shared data directly from the tools of their choice. You can centrally manage, govern, audit and track shared data usage on one platform.
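
As a rough illustration of the pandas integration, a recipient using the open source delta-sharing Python connector might read a shared table as sketched below. The profile file path and the share, schema and table names are placeholders, and the sketch assumes the recipient has been given a profile file by the data provider and has installed the delta-sharing package.

```python
def shared_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the table URL the connector expects: <profile-file>#<share>.<schema>.<table>."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into a pandas DataFrame.

    Assumes the delta-sharing package is installed (pip install delta-sharing)
    and that profile_path points to a valid profile file from the provider.
    """
    import delta_sharing  # imported lazily so the URL helper works without the package

    return delta_sharing.load_as_pandas(shared_table_url(profile_path, share, schema, table))


# Placeholder names for illustration only.
url = shared_table_url("config.share", "sales_share", "emea", "orders")
```

The recipient never receives cloud storage credentials for the provider's account; access is mediated by the sharing server named in the profile file.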

Track data movement

Data lineage is designed for comprehensive tracking and visualization of data as it moves through your jobs, dashboards and queries. For business users, it provides critical insight into the journey of data, from its origin to its final destination, facilitating better understanding and management of data workflows. For security users, it shows who is consuming, storing or transforming your most sensitive data. Lineage also records detailed metadata about data transformations, which supports regulatory compliance efforts and makes it easier to adhere to data privacy and security standards. This is crucial for industries with stringent data handling requirements.
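
One way to picture how lineage answers the security question "where does my sensitive data end up?" is as a graph traversal. The sketch below uses a hand-written, hypothetical lineage graph and a breadth-first search to list everything downstream of a source table. It does not call any Unity Catalog API; the table names and the downstream helper are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage edges: each key feeds the assets listed as its value.
LINEAGE: Dict[str, List[str]] = {
    "raw.customers": ["silver.customers_clean"],
    "silver.customers_clean": ["gold.customer_360", "gold.churn_features"],
    "gold.churn_features": ["ml.churn_model"],
    "raw.orders": ["gold.customer_360"],
}


def downstream(graph: Dict[str, List[str]], source: str) -> Set[str]:
    """Return every asset reachable from source, using breadth-first search."""
    seen: Set[str] = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Everything that directly or indirectly consumes the sensitive customer table.
customer_consumers = downstream(LINEAGE, "raw.customers")
order_consumers = downstream(LINEAGE, "raw.orders")
```

In this toy graph, data from raw.customers reaches a cleaned table, two gold tables and a model, so a policy review for customer data would need to cover all four.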

Securely query across data sources

With Lakehouse Federation in Unity Catalog, you can use one permission model to set and apply access rules and safeguard all your data across data sources. By defining external data sources (such as databases) in Unity Catalog, you can allow authorized users to query those sources without leaving the governance of Unity Catalog. Apply rules like row- and column-level security, tag-based policies and centralized auditing consistently across platforms, track data usage, and meet compliance requirements with built-in data lineage and auditability. Importantly, end users no longer need to know or manage credentials for those external systems, a win for both simplicity and security.

Learn more

We’ve focused here on the security-relevant features of Unity Catalog, but there are many other benefits, including Lakehouse Monitoring (data quality) and AI-generated documentation. Visit the Unity Catalog product page and our docs (AWS, Azure and GCP) to learn more.