Structured Streaming

Structured Streaming is a high-level API for stream processing that became production-ready in Spark 2.2. It lets you take the same operations that you perform in batch mode using Spark’s structured APIs and run them in a streaming fashion, which can reduce latency and allow for incremental processing. Its main advantage is that you can quickly get value out of streaming systems with virtually no code changes. It is also easy to reason about: you can write a batch job as a prototype and then convert it to a streaming job. All of this works by processing the data incrementally as it arrives.
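To make the idea of incremental processing concrete, here is a minimal conceptual sketch in plain Python. It is not Spark code and does not use the Structured Streaming API; the names `batch_count`, `IncrementalCount`, and the sample `events` list are hypothetical and exist only for illustration. It shows the same aggregation computed once over a full dataset and then incrementally over small batches, arriving at the same result.

```python
from collections import Counter

# NOTE: conceptual sketch only. This is plain Python, not the Spark API.
# All names here (batch_count, IncrementalCount, events) are hypothetical.


def batch_count(records):
    """Batch mode: count every record in the full dataset at once."""
    return Counter(records)


class IncrementalCount:
    """Streaming mode: keep running state and update it per micro-batch."""

    def __init__(self):
        self.state = Counter()

    def process(self, micro_batch):
        # Fold only the newly arrived records into the existing state.
        self.state.update(micro_batch)
        # Return a copy so earlier snapshots are not mutated later.
        return Counter(self.state)


events = ["click", "view", "click", "purchase", "view", "click"]

# Batch: one pass over all of the data.
batch_result = batch_count(events)

# Streaming: the same data arrives in three small batches.
micro_batches = [events[0:2], events[2:4], events[4:6]]
stream = IncrementalCount()
snapshots = [stream.process(mb) for mb in micro_batches]
streaming_result = snapshots[-1]

# Once all data has arrived, both approaches agree.
print(streaming_result == batch_result)
```

Each snapshot is a usable intermediate answer, which is where the latency benefit comes from: results are available after every small batch rather than only after the whole dataset has been processed.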
