SESSION

State Reader API: the New "Statestore" Data Source

OVERVIEW

EXPERIENCE: In Person
TYPE: Lightning Talk
TRACK: Data Engineering and Streaming
INDUSTRY: Energy and Utilities, Health and Life Sciences, Media and Entertainment
TECHNOLOGIES: Apache Spark, Developer Experience
SKILL LEVEL: Advanced
DURATION: 20 min

Databricks has added a capability that lets users access and analyze Structured Streaming's internal state data: the State Reader API. Unlike well-known Spark data formats such as JSON, CSV, Avro, and Protobuf, the State Reader API exists primarily to facilitate the development, debugging, and troubleshooting of stateful Structured Streaming workloads. Apache Spark™ 4.0.0, expected to be released later this year, will include the State Reader API. This talk revisits stateful operator basics, explains common pain points with state data, and shows how the new State Reader API helps.
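
As a rough illustration of what reading state through the "statestore" data source can look like, here is a minimal PySpark sketch. It is not taken from the talk. The helper names `state_reader_options` and `read_state` are hypothetical, and the checkpoint path in the usage note is a placeholder. The sketch assumes the source accepts the `operatorId`, `batchId`, and `storeName` options.

```python
from typing import Any, Dict, Optional


def state_reader_options(operator_id: int = 0,
                         batch_id: Optional[int] = None,
                         store_name: Optional[str] = None) -> Dict[str, str]:
    """Build the option map passed to the "statestore" source (hypothetical helper)."""
    # Assumed option: operatorId selects which stateful operator in the query to read.
    options = {"operatorId": str(operator_id)}
    # Assumed option: batchId pins the read to the state as of a given micro-batch.
    if batch_id is not None:
        options["batchId"] = str(batch_id)
    # Assumed option: storeName picks a specific store for multi-store operators.
    if store_name is not None:
        options["storeName"] = store_name
    return options


def read_state(spark: Any, checkpoint_path: str, **kwargs: Any) -> Any:
    """Load the state under a streaming query's checkpoint as a batch DataFrame."""
    return (spark.read.format("statestore")
            .options(**state_reader_options(**kwargs))
            .load(checkpoint_path))
```

With an active `SparkSession` named `spark`, a call such as `read_state(spark, "/tmp/checkpoints/my_query", operator_id=0).show()` would display the state rows of that operator, which can then be filtered and aggregated like any other DataFrame while debugging a stateful query.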

SESSION SPEAKERS

Craig Lukasik

Sr. SSA
Databricks