SESSION

Data Ingestion to Delta Lake

OVERVIEW

EXPERIENCE: In Person
TYPE: Paid Training
TRACK: Paid Training
DURATION: 240 min

  • Audience: Data engineers
  • Hands-on labs: Yes
  • Certification path: Databricks Certified Data Engineer Associate
  • Description: In this half-day course, you’ll learn how to ingest data into Delta Lake and manage that data. Topics include: the Delta Lake features that make it the foundation for the data lakehouse architecture; how to use Delta Lake DDL to create tables, compact files, restore previous table versions, and perform garbage collection on tables; how to use CTAS (CREATE TABLE AS SELECT) to store data derived from a query in a Delta Lake table; and how to use SQL to perform complete and incremental updates to existing tables.
  • Prerequisites: Beginner familiarity with cloud computing concepts (virtual machines, object storage, etc.); production experience working with data warehouses and data lakes; intermediate experience with basic SQL concepts (select, filter, group by, join, etc.); beginner programming experience with Python (syntax, conditions, loops, functions); and beginner programming experience with the Spark DataFrame API (configuring DataFrameReader and DataFrameWriter to read and write data, expressing query transformations using DataFrame methods and Column expressions, etc.).
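
For orientation, the Delta Lake operations named in the course description might look roughly like the statements below. This is a minimal sketch, not course material: the table names (events, staged_events, daily_counts), the columns, and the run_statements helper are hypothetical, and the statements assume a Spark session with Delta Lake SQL support enabled.

```python
# Hypothetical table and column names, chosen only for illustration.
DELTA_SQL = {
    # DDL: create a Delta table
    "create_table": (
        "CREATE TABLE IF NOT EXISTS events "
        "(id BIGINT, ts TIMESTAMP, payload STRING) USING DELTA"
    ),
    # CTAS: store the result of a query in a Delta table
    "ctas": (
        "CREATE OR REPLACE TABLE daily_counts USING DELTA AS "
        "SELECT CAST(ts AS DATE) AS day, COUNT(*) AS n "
        "FROM events GROUP BY CAST(ts AS DATE)"
    ),
    # Compact small files
    "compact": "OPTIMIZE events",
    # Restore a previous table version (version 0 is an assumed example)
    "restore": "RESTORE TABLE events TO VERSION AS OF 0",
    # Garbage-collect data files no longer referenced by the table
    "garbage_collect": "VACUUM events RETAIN 168 HOURS",
    # Complete update: replace the table contents
    "complete_update": "INSERT OVERWRITE events SELECT * FROM staged_events",
    # Incremental update: upsert staged rows by key
    "incremental_update": (
        "MERGE INTO events AS t USING staged_events AS s ON t.id = s.id "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    ),
}


def run_statements(spark, names):
    """Run the named statements in order on an existing SparkSession.

    `spark` is assumed to be a session configured with Delta Lake.
    """
    return [spark.sql(DELTA_SQL[name]) for name in names]
```

With a configured session, a call such as run_statements(spark, ["create_table", "ctas"]) would execute the first two statements in order.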