Session

Building ETL Pipelines with SQL

Overview

Experience: In Person
Track: Paid Training

This hands-on course teaches you how to build production-ready ETL pipelines using pure SQL on the Databricks Data Intelligence Platform. You'll learn the declarative building blocks — Streaming Tables, Materialized Views, and AUTO CDC — that replace complex procedural ingestion and transformation logic with concise, incremental, and fully managed pipeline definitions. Following a realistic retail dataset through the medallion architecture, you'll incrementally ingest files with Auto Loader, build pre-computed Silver-to-Gold transformations with incremental refresh, manage SCD Type 1 and Type 2 dimensions with AUTO CDC, and orchestrate the full pipeline using Lakeflow Jobs with SQL File tasks and DAG-based workflows.
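The declarative building blocks described above can be sketched in Databricks SQL. This is an illustrative sketch only: the table names (`orders_bronze`, `daily_sales_gold`, `customers_silver`), column names, and the volume path are hypothetical, and the exact AUTO CDC syntax may differ across Databricks releases.

```sql
-- Bronze: incrementally ingest raw files with Auto Loader (read_files + STREAM)
CREATE OR REFRESH STREAMING TABLE orders_bronze
AS SELECT *
FROM STREAM read_files(
  '/Volumes/retail/raw/orders',   -- hypothetical volume path
  format => 'json'
);

-- Gold: a pre-computed aggregate that the platform refreshes incrementally
CREATE OR REFRESH MATERIALIZED VIEW daily_sales_gold
AS SELECT order_date,
          SUM(amount) AS total_sales
FROM orders_bronze
GROUP BY order_date;

-- Silver: an SCD Type 2 dimension maintained declaratively with AUTO CDC
CREATE OR REFRESH STREAMING TABLE customers_silver;

CREATE FLOW customers_cdc AS
AUTO CDC INTO customers_silver
FROM STREAM customers_bronze      -- hypothetical change-feed source table
KEYS (customer_id)
SEQUENCE BY sequence_ts
STORED AS SCD TYPE 2;
```

The point of the declarative style is that each statement states the desired end table rather than the procedural steps: checkpointing, incremental refresh, and CDC merge logic are managed by the platform.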

Prerequisites

  • Navigating the Databricks workspace — sidebar, Catalog Explorer, and SQL Editor
  • Unity Catalog basics — catalogs, schemas, tables, and volumes
  • Intermediate SQL — SELECT, JOIN, GROUP BY, CAST, COALESCE, CREATE TABLE, and INSERT
  • Data warehousing concepts — fact tables, dimension tables, star schemas, and the medallion architecture
  • Basic understanding of ETL — Extract, Transform, Load workflows and why incremental processing matters

Note: Hands-on training courses will be updated to reflect the newest product and feature announcements from Data + AI Summit in June 2026.