Apache Spark™ Programming with Databricks
This course is an entry point for learning Apache Spark programming with Databricks.
Below, we describe each of the four four-hour modules included in this course.
Introduction to Apache Spark
This course offers essential knowledge of Apache Spark, with a focus on its distributed architecture and practical applications for large-scale data processing. Participants will explore programming frameworks, learn the Spark DataFrame API, and develop skills for reading, writing, and transforming data using Python-based Spark workflows.
Developing Applications with Apache Spark
Master scalable data processing with Apache Spark in this hands-on course. Learn to build efficient ETL pipelines, perform advanced analytics, and optimize distributed data transformations using Spark’s DataFrame API. Explore grouping, aggregation, joins, set operations, and window functions. Work with complex data types like arrays, maps, and structs while applying best practices for performance optimization.
Stream Processing and Analysis with Apache Spark
Learn the essentials of stream processing and analysis with Apache Spark in this course. Gain a solid understanding of stream processing fundamentals and develop applications using the Spark Structured Streaming API. Explore advanced techniques such as stream aggregation and window analysis to process real-time data efficiently. This course equips you with the skills to create scalable and fault-tolerant streaming applications for dynamic data environments.
Monitoring and Optimizing Apache Spark Workloads on Databricks
This course explores the Lakehouse architecture and Medallion design for scalable data workflows, focusing on Unity Catalog for secure data governance, access control, and lineage tracking. The curriculum includes building reliable, ACID-compliant pipelines with Delta Lake. You'll examine Spark optimization techniques, such as partitioning, caching, and query tuning, and learn performance monitoring, troubleshooting, and best practices for efficient data engineering and analytics to address real-world challenges.
Prerequisites
- Basic programming knowledge
- Familiarity with Python
- Basic understanding of SQL queries (SELECT, JOIN, GROUP BY)
- Familiarity with data processing concepts
- No prior Spark or Databricks experience required
Outline
Introduction to Apache Spark
Spark Runtime Architecture
Exploring Apache Spark Architecture in Databricks
Introduction to Spark DataFrames and SQL
Reading and Writing Data with DataFrames
Distributed System Programming Fundamentals
Basic ETL with the DataFrame API
Flight Data ETL with the DataFrame API
Analyzing Transaction Data with DataFrames
Developing Applications with Apache Spark
DataFrame API Basics
Demo: (Optional) Basic ETL with the DataFrame API
Grouping and Aggregating Data
Demo: Grouping and Aggregating Data
Lab: Grouping and Aggregating E-Commerce Data
Relational Operations
Demo: Data Relational Operations in Apache Spark
Working with Complex Data
Demo: Working with Complex Data Types in Apache Spark
Lab: Working with Complex Data Types in E-Commerce Data
Stream Processing and Analysis with Apache Spark
Introduction to Stream Processing
Spark Structured Streaming
Demo: Introduction to Spark Structured Streaming
Lab: Introduction to Spark Structured Streaming
Advanced Stream Processing and Analysis
Demo: Window Aggregation in Spark Structured Streaming
Lab: Window Aggregation in Spark Structured Streaming
Monitoring and Optimizing Apache Spark Workloads on Databricks
Apache Spark and Databricks
Using Apache Spark with Delta Lake
Demo: Introduction to Delta Lake
Lab: Introduction to Delta Lake
Optimizing Apache Spark
Demo: Optimizing Apache Spark
Lab: Optimizing Apache Spark
Upcoming Public Classes
Date | Time | Language | Price |
---|---|---|---|
Jun 02 - 03 | 09 AM - 05 PM (Europe/Paris) | English | $1500.00 |
Jun 02 - 03 | 09 AM - 05 PM (America/Los_Angeles) | English | $1500.00 |
Jun 23 - 26 | 02 PM - 06 PM (Europe/Paris) | English | $1500.00 |
Jun 23 - 26 | 02 PM - 06 PM (America/New_York) | English | $1500.00 |
Jul 01 - 02 | 09 AM - 05 PM (Europe/Paris) | English | $1500.00 |
Jul 01 - 02 | 09 AM - 05 PM (America/Los_Angeles) | English | $1500.00 |
Jul 07 - 08 | 09 AM - 05 PM (Asia/Tokyo) | Japanese | $1500.00 |
Jul 10 - 11 | 09 AM - 05 PM (Australia/Sydney) | English | $1500.00 |
Jul 21 - 24 | 11 AM - 03 PM (Asia/Singapore) | English | $1500.00 |
Jul 28 - 31 | 02 PM - 06 PM (Europe/Paris) | English | $1500.00 |
Jul 28 - 31 | 02 PM - 06 PM (America/New_York) | English | $1500.00 |
Aug 11 - 12 | 09 AM - 05 PM (Australia/Sydney) | English | $1500.00 |
Aug 11 - 12 | 09 AM - 05 PM (Europe/Paris) | English | $1500.00 |
Aug 11 - 12 | 09 AM - 05 PM (America/Los_Angeles) | English | $1500.00 |
Aug 18 - 21 | 11 AM - 03 PM (Asia/Singapore) | English | $1500.00 |
Aug 18 - 21 | 02 PM - 06 PM (Europe/Paris) | English | $1500.00 |
Aug 18 - 21 | 02 PM - 06 PM (America/New_York) | English | $1500.00 |
Sep 08 - 09 | 09 AM - 05 PM (Australia/Sydney) | English | $1500.00 |
Sep 08 - 09 | 09 AM - 05 PM (Europe/Paris) | English | $1500.00 |
Sep 08 - 09 | 09 AM - 05 PM (America/Los_Angeles) | English | $1500.00 |
Sep 15 - 18 | 11 AM - 03 PM (Asia/Singapore) | English | $1500.00 |
Sep 15 - 18 | 02 PM - 06 PM (Europe/Paris) | English | $1500.00 |
Sep 15 - 18 | 02 PM - 06 PM (America/New_York) | English | $1500.00 |
Oct 01 - 02 | 09 AM - 05 PM (America/Los_Angeles) | English | $1500.00 |
Oct 06 | 09 AM - 05 PM (Australia/Sydney) | English | $1500.00 |
Oct 06 - 07 | 09 AM - 05 PM (Europe/Paris) | English | $1500.00 |
Oct 20 - 23 | 02 PM - 06 PM (Europe/Paris) | English | $1500.00 |
Oct 20 - 23 | 02 PM - 06 PM (America/New_York) | English | $1500.00 |
Oct 27 - 30 | 11 AM - 03 PM (Asia/Singapore) | English | $1500.00 |
Public Class Registration
If your company has purchased success credits or has a learning subscription, please fill out the Training Request form. Otherwise, you can register below.
Private Class Request
If your company is interested in private training, please submit a request.
Registration options
Databricks has a delivery method for wherever you are on your learning journey
Self-Paced
Custom-fit learning paths for data, analytics, and AI roles and career paths through on-demand videos
Instructor-Led
Public and private courses taught by expert instructors, ranging from half a day to two days in length
Blended Learning
Self-paced and weekly instructor-led sessions for every style of learner to optimize course completion and knowledge retention. Go to the Subscriptions Catalog tab to purchase
Skills@Scale
Comprehensive training offering for large-scale customers that includes learning elements for every style of learning. Inquire with your account executive for details