Platform blog

Unity Catalog Lakeguard: Industry-first and only data governance for multi-user Apache™ Spark clusters

We are thrilled to announce Unity Catalog Lakeguard, which allows you to run Apache Spark™ workloads in SQL, Python, and Scala with...
Engineering blog

Introducing Apache Spark™ 3.5

Today, we are happy to announce the availability of Apache Spark™ 3.5 on Databricks as part of Databricks Runtime 14.0. We extend our...
Engineering blog

Shared Clusters in Unity Catalog for the win: Introducing Cluster Libraries, Python UDFs, Scala, Machine Learning and more

We are thrilled to announce that you can run even more workloads on Databricks’ highly efficient multi-user clusters thanks to new security and...
Engineering blog

Spark Connect Available in Apache Spark 3.4

Last year Spark Connect was introduced at the Data and AI Summit. As part of the recently released Apache Spark™ 3.4, Spark Connect...
Engineering blog

Introducing Apache Spark™ 3.4 for Databricks Runtime 13.0

Today, we are happy to announce the availability of Apache Spark™ 3.4 on Databricks as part of Databricks Runtime 13.0. We extend...
Platform blog

Power to the SQL People: Introducing Python UDFs in Databricks SQL

We were thrilled to announce the preview for Python User-Defined Functions (UDFs) in Databricks SQL (DBSQL) at last month's Data and AI Summit...
Engineering blog

Introducing Spark Connect - The Power of Apache Spark, Everywhere

At last week's Data and AI Summit, we highlighted a new project called Spark Connect in the opening keynote. This blog post walks...
Engineering blog

Adaptive Query Execution: Speeding Up Spark SQL at Runtime

Read Rise of the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data...
Engineering blog

Introducing New Built-in and Higher-Order Functions for Complex Data Types in Apache Spark 2.4

Apache Spark 2.4 introduces 29 new built-in functions for manipulating complex types (for example, array type), including higher-order...
Company blog

Working with Nested Data Using Higher Order Functions in SQL on Databricks

Nested data types offer Databricks customers and Apache Spark users powerful ways to manipulate structured data. In particular...
Engineering blog

SQL Subqueries in Apache Spark 2.0

In the upcoming Apache Spark 2.0 release, we have substantially expanded the SQL standard capabilities. In this brief...