Building ETL Pipelines with SQL
Overview
| | |
|---|---|
| Experience | In Person |
| Track | Paid Training |
This hands-on course teaches you how to build production-ready ETL pipelines using pure SQL on the Databricks Data Intelligence Platform. You'll learn the declarative building blocks — Streaming Tables, Materialized Views, and AUTO CDC — that replace complex procedural ingestion and transformation logic with concise, incremental, and fully managed pipeline definitions. Following a realistic retail dataset through the medallion architecture, you'll incrementally ingest files with Auto Loader, build pre-computed Silver-to-Gold transformations with incremental refresh, manage SCD Type 1 and Type 2 dimensions with AUTO CDC, and orchestrate the full pipeline using Lakeflow Jobs with SQL File tasks and DAG-based workflows.
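As a taste of the declarative style the course covers, a Bronze-to-Gold pipeline might be sketched as below. This is an illustrative sample only: the table names, volume path, and column names are hypothetical, and the exact AUTO CDC flow syntax may vary with your Databricks runtime version.

```sql
-- Bronze: incrementally ingest raw JSON files with Auto Loader (read_files)
CREATE OR REFRESH STREAMING TABLE orders_bronze AS
SELECT *, current_timestamp() AS ingest_ts
FROM STREAM read_files(
  '/Volumes/retail/raw/orders/',    -- hypothetical volume path
  format => 'json'
);

-- Silver: cleaned, typed records as an incrementally refreshed materialized view
CREATE OR REFRESH MATERIALIZED VIEW orders_silver AS
SELECT
  CAST(order_id AS BIGINT)           AS order_id,
  CAST(order_ts AS TIMESTAMP)        AS order_ts,
  COALESCE(customer_id, 'unknown')   AS customer_id,
  CAST(amount AS DECIMAL(10, 2))     AS amount
FROM orders_bronze;

-- Gold dimension: SCD Type 2 history managed declaratively with AUTO CDC
CREATE OR REFRESH STREAMING TABLE dim_customer;

CREATE FLOW customer_changes AS AUTO CDC INTO dim_customer
FROM STREAM customers_silver       -- hypothetical change feed source
KEYS (customer_id)
SEQUENCE BY updated_ts
STORED AS SCD TYPE 2;
```

Each statement is a managed, incremental pipeline definition: re-running the pipeline processes only new files and changed rows rather than reloading the full dataset.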
Prerequisites
- Navigating the Databricks workspace — sidebar, Catalog Explorer, and SQL Editor
- Unity Catalog basics — catalogs, schemas, tables, and volumes
- Intermediate SQL — SELECT, JOIN, GROUP BY, CAST, COALESCE, CREATE TABLE, and INSERT
- Data warehousing concepts — fact tables, dimension tables, star schemas, and the medallion architecture
- Basic understanding of ETL — Extract, Transform, Load workflows and why incremental processing matters
Note: Hands-on training courses will be updated to reflect the newest product and feature announcements from Data + AI Summit in June 2026.