Engineering blog

Introducing Python User-Defined Table Functions (UDTFs)

Apache Spark™ 3.5 and Databricks Runtime 14.0 have brought an exciting feature to the table: Python user-defined table functions (UDTFs). In this blog...

Arrow-optimized Python UDFs in Apache Spark™ 3.5

In Apache Spark™, Python User-Defined Functions (UDFs) are among the most popular features. They empower users to craft custom code tailored to their...

Introducing Apache Spark™ 3.5

Today, we are happy to announce the availability of Apache Spark™ 3.5 on Databricks as part of Databricks Runtime 14.0. We extend our...

Spark Connect Available in Apache Spark 3.4

Last year Spark Connect was introduced at the Data and AI Summit. As part of the recently released Apache Spark™ 3.4, Spark Connect...

Introducing Apache Spark™ 3.4 for Databricks Runtime 13.0

Today, we are happy to announce the availability of Apache Spark™ 3.4 on Databricks as part of Databricks Runtime 13.0. We extend...

Memory Profiling in PySpark

There are many factors in a PySpark program's performance. PySpark supports various profiling tools to expose tight loops of your program and allow...

How to Profile PySpark

In Apache Spark™, declarative Python APIs are supported for big data workloads. They are powerful enough to handle most common use cases. Furthermore...