Build Data Pipelines with Lakeflow Declarative Pipelines
Overview
Monday
June 09
2:00 pm
Experience | In Person |
---|---|
Type | Paid Training |
Duration | 240 min |
In this course, you'll learn how to define and schedule data pipelines that incrementally ingest and process data through multiple tables on the Data Intelligence Platform, using Lakeflow Declarative Pipelines in Spark SQL and Python. We'll cover how to get started with Lakeflow Declarative Pipelines; how Lakeflow Declarative Pipelines tracks data dependencies in data pipelines; how to configure and run data pipelines using the Lakeflow Declarative Pipelines UI; how to use Python or Spark SQL to define pipelines that ingest and process data through multiple tables with Auto Loader and Lakeflow Declarative Pipelines; how to use APPLY CHANGES INTO syntax to process Change Data Capture (CDC) feeds; and how to review the event logs and data artifacts created by pipelines and troubleshoot syntax issues.

By streamlining and automating reliable data ingestion and transformation workflows, this course equips you with the foundational data engineering skills needed to help kickstart AI use cases. Whether you're preparing high-quality training data or enabling real-time AI-driven insights, this course is a key step in advancing your AI journey.

Prerequisites: Beginner familiarity with the Databricks Data Intelligence Platform (selecting clusters, navigating the Workspace, executing notebooks) and cloud computing concepts (virtual machines, object storage, etc.); production experience working with data warehouses and data lakes; intermediate experience with basic SQL concepts (select, filter, group by, join, etc.); beginner programming experience with Python (syntax, conditions, loops, functions); and beginner programming experience with the Spark DataFrame API (configuring DataFrameReader and DataFrameWriter to read and write data, expressing query transformations using DataFrame methods and Column expressions, etc.).

Labs: No

Certification Path: Databricks Certified Data Engineer Associate