HomepageData + AI Summit 2022 Logo
Watch on demand

Power to the (SQL) People: Python UDFs in DBSQL

On Demand

Type

  • Session

Format

  • In-Person

Track

  • Data Lakes, Data Warehouses and Data Lakehouses

Difficulty

  • Intermediate

Room

  • Moscone South | Level 3 | 314

Duration

  • 35 min
Download session slides

Overview

Databricks SQL (DB SQL) allows customers to leverage the simple and powerful Lakehouse architecture with up to 12x better price/performance compared to traditional cloud data warehouses. Analysts can use standard SQL to easily query data and share insights using a query editor, dashboards or a BI tool of their choice, and analytics engineers can build and maintain efficient data pipelines, including with tools like dbt.

While SQL is great at querying and transforming data, sometimes you need to extend its capabilities with the power of Python, a full programming language. Users of Databricks notebooks already enjoy seamlessly mixing SQL, Python and several other programming languages. Use cases include masking or encrypting and decrypting sensitive data, complex transformation logic, using popular open source libraries or simply reusing code that has already been written elsewhere in Databricks. In many cases, it is simply prohibitive or even impossible to rewrite the logic in SQL.

Up to now, there was no way to use Python from within DBSQL. We are removing this restriction with the introduction of Python User Defined Functions (UDFs). DBSQL users can now create, manage and use Python UDFs using standard SQL. UDFs are registered in Unity Catalog, which means they can be governed and used throughout Databricks, including in notebooks.

Session Speakers

Stefania Leone

Sr. Manager Product Management

Databricks

Martin Grund

Senior Staff Software Engineer

Databricks

See the best of Data+AI Summit

Watch on demand