SESSION

Databricks Streaming and Delta Live Tables

OVERVIEW

EXPERIENCE: In Person
TYPE: Paid Training
TRACK: Paid Training
DURATION: 240 min
  • Audience: Data engineers
  • Hands-on labs: Yes
  • Certification path: Databricks Certified Data Engineer Professional
  • Description: In this half-day course, you’ll learn how to incrementally process data to power analytic insights with Structured Streaming and Auto Loader, and how to apply design patterns for ETL workloads in the Lakehouse with Delta Live Tables. First, we’ll cover ingesting raw streaming data, enforcing data quality, implementing change data capture (CDC), and exploring and tuning state information. Then, we’ll cover options for performing a streaming read on a source, requirements for end-to-end fault tolerance, options for performing a streaming write to a sink, and creating an aggregation and watermark on a streaming dataset (see the sketches after this list).
  • Pre-requisites:
      • Ability to perform basic code development tasks in the Databricks workspace: create clusters, run code in notebooks, use basic notebook operations, import repos from Git, etc.
      • Intermediate programming experience with PySpark: extract data from a variety of file formats and data sources, apply common transformations to clean data, and reshape and manipulate complex data using advanced built-in functions
      • Intermediate experience with Delta Lake: create tables, perform complete and incremental updates, compact files, restore previous versions, etc. (a short refresher sketch follows below)
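
To make the Delta Live Tables half concrete, here is a minimal sketch of a two-table DLT pipeline that ingests raw data incrementally with Auto Loader and enforces a data quality expectation. All paths, table names, and column names (orders_bronze, order_id, and so on) are hypothetical, not taken from the course materials:

    import dlt

    # Bronze table: ingest raw streaming JSON incrementally with Auto Loader.
    @dlt.table(comment="Raw orders landed from cloud storage.")
    def orders_bronze():
        return (
            spark.readStream.format("cloudFiles")  # spark is provided by the DLT runtime
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaLocation", "/tmp/schemas/orders")  # hypothetical path
            .load("/tmp/landing/orders")                                 # hypothetical path
        )

    # Silver table: enforce data quality by dropping rows that fail the expectation.
    @dlt.table(comment="Orders with a basic quality gate applied.")
    @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
    def orders_silver():
        return dlt.read_stream("orders_bronze")

The CDC patterns mentioned in the description are typically built on top of a pipeline like this with DLT's apply_changes API.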
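
For the Structured Streaming half, here is a sketch of a streaming read, a watermarked windowed aggregation, and a fault-tolerant streaming write to a Delta sink; again, all table names, column names, and paths are placeholders:

    from pyspark.sql import functions as F

    # Streaming read from a Delta table (one of several supported source options).
    events = spark.readStream.table("orders_silver")

    # The watermark bounds aggregation state: events arriving more than
    # 10 minutes late are dropped, so state for old windows can be purged.
    counts = (
        events
        .withWatermark("event_time", "10 minutes")  # hypothetical timestamp column
        .groupBy(F.window("event_time", "5 minutes"), "order_status")
        .count()
    )

    # Streaming write to a Delta sink; the checkpoint location is what gives
    # the query end-to-end, exactly-once fault tolerance across restarts.
    query = (
        counts.writeStream.format("delta")
        .outputMode("append")
        .option("checkpointLocation", "/tmp/checkpoints/order_counts")  # hypothetical path
        .toTable("orders_by_status")
    )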
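
The Delta Lake prerequisite maps to operations like the following; the table and source names (demo.orders, updates) are placeholders:

    # Create a Delta table (Delta is the default table format on Databricks).
    spark.sql("CREATE TABLE IF NOT EXISTS demo.orders (order_id STRING, amount DOUBLE)")

    # Incremental update: upsert staged changes with MERGE.
    spark.sql("""
        MERGE INTO demo.orders AS t
        USING updates AS u
        ON t.order_id = u.order_id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)

    # Compact small files, then restore an earlier version of the table.
    spark.sql("OPTIMIZE demo.orders")
    spark.sql("RESTORE TABLE demo.orders TO VERSION AS OF 1")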