Jason Lowe is a Distinguished Software Engineer at Nvidia. He is a PMC member of Apache Hadoop and Tez. Prior to Nvidia, he worked for Yahoo on the Big Data Platform team on Apache Hadoop and Tez. He has held positions of Architect for the embedded Linux OS group in Motorola and Principal Programmer at Volition Games. Jason holds a BS in Computer Science from the University of Illinois at Urbana-Champaign.
May 27, 2021 12:10 PM PT
The RAPIDS Accelerator for Apache Spark is a plugin that enables the power of GPUs to be leveraged in Spark DataFrame and SQL queries, improving the performance of ETL pipelines. User-defined functions (UDFs) in the query appear as opaque transforms and can prevent the RAPIDS Accelerator from processing some query operations on the GPU.
This presentation discusses how users can leverage the RAPIDS Accelerator UDF Compiler to automatically translate some simple UDFs to equivalent Catalyst operations that are processed on the GPU. The presentation also covers how users can provide a GPU version of Scala, Java, or Hive UDFs for maximum control and performance. Sample UDFs for each case will be shown along with how the query plans are impacted when the UDFs are processed on the GPU.
June 24, 2020 05:00 PM PT
Utilizing accelerators in Apache Spark presents opportunities for significant speedup of ETL, ML and DL applications. In this deep dive, we give an overview of accelerator aware task scheduling, columnar data processing support, fractional scheduling, and stage level resource scheduling and configuration. Furthermore, we dive into the Apache Spark 3.x RAPIDS plugin, which enables applications to take advantage of GPU acceleration with no code change. An explanation of how the Catalyst optimizer physical plan is modified for GPU aware scheduling is reviewed. The talk touches upon how the plugin can take advantage of the RAPIDS specific libraries, cudf, cuio and rmm to run tasks on the GPU. Optimizations were also made to the shuffle plugin to take advantage of the GPU using UCX, a unified communication framework that addresses GPU memory intra and inter node. A roadmap for further optimizations taking advantage of RDMA and GPU Direct Storage are mentioned. Industry standard benchmarks and runs on production datasets will be shared..