Session

What’s New in Apache Spark™ 4.0?

Thursday, June 12, 12:30 pm

Experience: In Person
Type: Breakout
Track: Data Engineering and Streaming
Industry: Enterprise Technology
Technologies: Apache Spark
Skill Level: Intermediate
Duration: 40 min

Join this session for a concise tour of Apache Spark™ 4.0’s most notable enhancements:

  • SQL features: ANSI mode by default, SQL scripting, SQL pipe syntax, SQL UDFs, session variables, view schema evolution, etc.
  • Data types: VARIANT type, string collation
  • Python features: Python data source, plotting API, etc.
  • Streaming improvements: State store data source, state store checkpoint v2, arbitrary state v2, etc.
  • Spark Connect improvements: More API coverage, thin client, unified Scala interface, etc.
  • Infrastructure: Better error messages, structured logging, new Java/Scala version support, etc.

Whether you’re a seasoned Spark user or new to the ecosystem, this talk will prepare you to leverage Spark 4.0’s latest innovations for modern data and AI pipelines.

Session Speakers

Daniel Tenedorio

Sr. Staff Software Engineer
Databricks

Wenchen Fan

Senior Staff Software Engineer
Databricks