Data Modeling Strategies

This course walks practitioners through the full spectrum of data modelling approaches on the Databricks Data Intelligence Platform - from classical data warehouse techniques (Inmon, Kimball, Data Vault 2.0), through Feature Store-driven ML use cases, to productising data via Data Products on Unity Catalog.

Each modelling approach is introduced with a lecture, then reinforced with a hands-on demo against a shared dataset (TPC-H samples). The course finishes with a comprehensive end-to-end lab that exercises ERM, dimensional modelling, Data Vault 2.0, and the Feature Store in a single integrated workflow.

Note:

1. The course includes practice labs, which learners should complete after working through the entire course.

2. For SCORM lecture files, please ensure that you close the SCORM window after completing the content. Do not click the ‘Next Lesson’ button, as doing so may prevent the SCORM module from being marked as complete.

Languages Available: English | 日本語 | Português BR | 한국어

Skill Level

Associate

Duration

Prerequisites

This course was developed for participants with the following skills, knowledge, and abilities:

• Working knowledge of SQL and relational database concepts

• Familiarity with Databricks fundamentals (workspaces, notebooks, Unity Catalog basics)

• Conceptual understanding of OLTP vs OLAP and the medallion architecture

• Basic exposure to Python and PySpark is helpful but not required

• Awareness of dimensional modelling concepts is helpful but not required

Self-Paced

Custom-fit learning paths for data, analytics, and AI roles and career paths through on-demand videos

Registration options

Databricks has a delivery method for wherever you are on your learning journey

Self-Paced

Custom-fit learning paths for data, analytics, and AI roles and career paths through on-demand videos

Instructor-Led

Public and private courses taught by expert instructors, ranging in length from half a day to two days

Blended Learning

Self-paced and weekly instructor-led sessions for every style of learner to optimize course completion and knowledge retention. Go to the Subscriptions Catalog tab to purchase.

Skills@Scale

Comprehensive training offering for large-scale customers that includes learning elements for every style of learning. Inquire with your account executive for details.

Upcoming Public Classes

AI/BI for Data Analysts - Mandarin Chinese

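This course teaches data analysts how to design, build, publish, and operate AI/BI Dashboards in Databricks. AI/BI Dashboards combine data governed by Unity Catalog with interactive visualizations, filters, and Genie integration, so business users can explore answers without writing code.

The course is organized around an end-to-end build project. You will start from source tables in Unity Catalog and finish with a published, monitored, multi-page dashboard. Along the way, you will learn how dashboards fit into the broader Databricks AI/BI product family, and the role that Genie, datasets, visualizations, and filters each play in the workflow.

Course content includes:

• AI/BI Dashboard fundamentals, and how dashboards relate to Genie and the rest of the Databricks platform.

• Exploring source data in Unity Catalog and using SQL to design reusable dashboard datasets.

• Creating visualizations (KPIs, trends, and breakdowns) and designing clean multi-page dashboard layouts.

• Using Genie Code to draft SQL, charts, and filters from natural-language prompts.

• Adding filters so that dashboards are interactive and respond to viewers' questions.

• Publishing and sharing dashboards and managing permissions so that the right people can view and edit them.

• Running dashboards in production with scheduled refreshes, caching, and usage monitoring.

Note: For SCORM lecture files, please ensure that you close the SCORM window after completing the content. Do not click the ‘Next Lesson’ button, as doing so may prevent the SCORM module from being marked as complete.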

Automated Deployment with Declarative Automation Bundles

This course provides a comprehensive review of DevOps principles and their application to Databricks projects. It begins with an overview of core DevOps, DataOps, continuous integration (CI), continuous deployment (CD), and testing, and explores how these principles can be applied to data engineering pipelines.

The course then focuses on continuous deployment within the CI/CD process, examining tools like the Databricks REST API, SDK, and CLI for project deployment. You will learn about Declarative Automation Bundles (DABs) and how they fit into the CI/CD process. You’ll dive into their key components, folder structure, and how they streamline deployment across various target environments in Databricks. You will also learn how to add variables, modify, validate, deploy, and execute Declarative Automation Bundles for multiple environments with different configurations using the Databricks CLI.

Finally, the course introduces Visual Studio Code as an integrated development environment (IDE) for building, testing, and deploying Declarative Automation Bundles locally, optimizing your development process. The course concludes with an introduction to automating deployment pipelines using GitHub Actions to enhance the CI/CD workflow with Declarative Automation Bundles.

By the end of this course, you will be equipped to automate Databricks project deployments with Declarative Automation Bundles, improving efficiency through DevOps practices.

Note:

1. Databricks Academy is transitioning from video lectures to a more streamlined PDF format with slides and notes for all self-paced courses. Please note that demo videos will still be available in their original format. We would love to hear your thoughts on this change, so please share your feedback through the course survey at the end. Thank you for being a part of our learning community!

2. This course is the fourth in the 'Advanced Data Engineering with Databricks' series.

Paid & Subscription

Lab

Professional

Data Warehousing Practitioner

Get Started with Databricks for Data Warehousing

This course provides a comprehensive, hands-on overview of Databricks' modern approach to data warehousing, highlighting how a data lakehouse architecture combines the strengths of traditional data warehouses with the flexibility and scalability of the cloud. Working end-to-end with a fictional retailer (Vintage Audio Co.), you will explore the workspace, query governed Unity Catalog tables, keep Delta tables fast for analytical reads, load data incrementally with Auto Loader, model a star schema, and deliver consistent insights through Unity Catalog metric views. All of this happens in a single, governed environment, with AI-driven features (Predictive Optimization, Genie Code) built in throughout.

Paid & Subscription

Lab

Onboarding