Engineering blog

Announcing the State Reader API: The New "Statestore" Data Source

Databricks Runtime 14.3 includes a new capability that allows users to access and analyze Structured Streaming's internal state data: the State Reader...

Introducing Apache Spark™ 3.5

Today, we are happy to announce the availability of Apache Spark™ 3.5 on Databricks as part of Databricks Runtime 14.0. We extend our...

Multiple Stateful Operators in Structured Streaming

August 7, 2023 by Angela Chu and Jungtaek Lim in Engineering Blog
In the world of data engineering, there are operations that have been used since the birth of ETL. You filter. You join. You...

Adaptive Query Execution in Structured Streaming

In Databricks Runtime, Adaptive Query Execution (AQE) is a performance feature that continuously re-optimizes batch queries using runtime statistics during query execution. Starting...

Introducing Apache Spark™ 3.4 for Databricks Runtime 13.0

Today, we are happy to announce the availability of Apache Spark™ 3.4 on Databricks as part of Databricks Runtime 13.0. We extend...

Python Arbitrary Stateful Processing in Structured Streaming

October 18, 2022 by Hyukjin Kwon and Jungtaek Lim in Engineering Blog
More and more customers are using Databricks for their real-time analytics and machine learning workloads to meet the ever-increasing demand of their...

Native Support of Session Window in Spark Structured Streaming

Apache Spark™ Structured Streaming allowed users to do aggregations on windows over event-time. Before Apache Spark 3.2™, Spark supported tumbling windows and...