Coding with AI Agents to Supercharge Your Spark UDFs with GPU-Acceleration
Overview
| Experience | In Person |
|---|---|
| Track | Artificial Intelligence & Agents |
| Industry | Enterprise Technology |
| Technologies | Databricks SQL |
| Skill Level | Beginner |
NVIDIA cuDF enables GPU-accelerated query execution for Apache Spark workloads. However, User-Defined Functions (UDFs) are a common performance bottleneck for vectorized query engines. Because UDFs contain opaque custom code, the engine must fall back to row-at-a-time execution.
We present a new set of agent skills in Project Aether that convert Spark UDFs into GPU-compatible implementations: either native SQL expressions or GPU UDFs invoking cuDF or CUDA primitives. The agent reasons through tasks in an end-to-end workflow: 1) generating CPU-behavior tests, 2) producing a GPU implementation, 3) running benchmarks, and 4) profiling and optimizing performance, with correctness verified through iterative feedback loops.
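To make steps 1 and 2 of the workflow concrete, here is a minimal sketch in plain Python, not the Project Aether implementation. It records the outputs of an original CPU function as a golden reference, then checks candidate replacements against it and reports mismatches that could be fed back into the next iteration. Every name here (`cpu_udf`, `record_golden`, `find_mismatches`, `candidate_v1`, `candidate_v2`, `EDGE_INPUTS`) is hypothetical, and the candidates are ordinary Python functions standing in for a generated SQL expression or GPU UDF.

```python
from typing import Callable, List, Optional, Tuple

Value = Optional[str]
Udf = Callable[[Value], Value]


def cpu_udf(s: Value) -> Value:
    # Hypothetical original UDF: trim all surrounding whitespace, then upper-case.
    return None if s is None else s.strip().upper()


def record_golden(udf: Udf, inputs: List[Value]) -> List[Tuple[Value, Value]]:
    # Step 1 (assumed shape): capture CPU behavior as (input, expected) pairs.
    return [(x, udf(x)) for x in inputs]


def find_mismatches(
    candidate: Udf, golden: List[Tuple[Value, Value]]
) -> List[Tuple[Value, Value, Value]]:
    # Return (input, expected, actual) for every case where the candidate
    # disagrees with the recorded CPU behavior.
    failures = []
    for x, expected in golden:
        actual = candidate(x)
        if actual != expected:
            failures.append((x, expected, actual))
    return failures


def candidate_v1(s: Value) -> Value:
    # First attempt (stand-in for a generated implementation):
    # strips only the space character, so tabs and newlines survive.
    return None if s is None else s.strip(" ").upper()


def candidate_v2(s: Value) -> Value:
    # Refined attempt after seeing the mismatch: strips all whitespace.
    return None if s is None else s.strip().upper()


# Edge cases chosen for illustration: null, empty, padded, tab/newline, mixed case.
EDGE_INPUTS: List[Value] = [None, "", "  spark ", "\tgpu\n", "MiXeD"]

golden = record_golden(cpu_udf, EDGE_INPUTS)
v1_failures = find_mismatches(candidate_v1, golden)
v2_failures = find_mismatches(candidate_v2, golden)

print("v1 mismatches:", v1_failures)
print("v2 mismatches:", v2_failures)
```

The first candidate disagrees with the CPU function only on the tab and newline input, which is the kind of subtle semantic gap that behavior tests are meant to catch before any benchmarking begins.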
By transforming UDFs to run on GPUs, agents enable full query acceleration with the cuDF plugin (RAPIDS Accelerator for Apache Spark), unlocking UDF speedups of 20x or more.
In this session, we will share practical lessons learned from building an AI agent for Spark.
Session Speakers
Felix Cheung, Product, NVIDIA
Rishi Chandra, Systems Software Engineer, NVIDIA