PySpark in 2023: A Year in Review

With the releases of Apache Spark 3.4 and 3.5 in 2023, we focused heavily on improving PySpark performance, flexibility, and ease of use...
Engineering blog

Simplify PySpark testing with DataFrame equality functions

The DataFrame equality test functions were introduced in Apache Spark™ 3.5 and Databricks Runtime 14.2 to simplify PySpark unit testing. The full set...
Engineering blog

Making Spark Accessible: My Databricks Summer Internship

September 26, 2023 by Amanda Liu in Engineering Blog
My summer internship on the PySpark team was a whirlwind of exciting events. The PySpark team develops the Python APIs of the open...
Engineering blog

Introducing English as the New Programming Language for Apache Spark

We are thrilled to unveil the English SDK for Apache Spark, a transformative tool designed to enrich your Spark experience. Apache Spark™...