GPUs power today’s most advanced AI workloads, from forecasting and recommendations to multimodal foundation models. Yet teams struggle to procure and manage GPU infrastructure, configure distributed training environments, and debug data loading bottlenecks. Deep learning researchers would rather focus on modeling than on troubleshooting infrastructure.
We’re excited to announce the Public Preview of AI Runtime (AIR), a new training stack that enables on-demand distributed GPU training on A10s and H100s. AI Runtime contains all the technology used for large-scale training of LLMs such as MPT and DBRX. Even in Beta, several hundred customers, including Rivian, FactSet, and YipitData, have used AIR to train and ship deep learning models into production. Use cases span the gamut from computer vision models to recommendation systems to fine-tuned LLMs for agentic tasks. Our own Databricks AI Research team used AIR for reinforcement learning of models, as in our recent KARL paper.
With AI Runtime, Databricks users now have the following capabilities.

For interactive development and debugging, connect to on-demand A10s and H100s in Databricks Notebooks with just a few clicks. From there, leverage all the developer ergonomics that Databricks is known for, from environment management for common Python packages to agent-powered authoring and debugging with Genie Code. Easily mount data from the Lakehouse to train deep learning models, or invoke a fleet of remote CPUs from your GPU-powered notebook to run Spark data preparation workloads.
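To make that concrete, here is a minimal sketch of the notebook flow, assuming a hypothetical Lakehouse feature table named main.demo.training_features; the tiny model and column names are placeholders as well:

```python
import torch
import torch.nn as nn

# Pull a (small) feature table to the GPU driver with the notebook's built-in
# Spark session. The catalog/schema/table name below is hypothetical.
features = spark.read.table("main.demo.training_features").toPandas()

X = torch.tensor(features[["f1", "f2", "f3"]].values, dtype=torch.float32).cuda()
y = torch.tensor(features["label"].values, dtype=torch.float32).cuda()

# A deliberately tiny model: the point is the workflow, not the architecture.
model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(X).squeeze(-1), y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss={loss.item():.4f}")
```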

Use Genie Code to help resolve performance bottlenecks, experiment with new architectures, or track down tricky issues such as model convergence problems and cryptic framework errors.
AI Runtime is a production-grade platform for accelerated computing. Develop your deep learning code in interactive notebooks, then use the full power of Lakeflow to submit and orchestrate jobs on GPU compute. Lakeflow can execute both notebooks and custom code repositories as long-running or scheduled jobs. For production needs such as CI/CD (continuous integration and continuous deployment), AI Runtime is fully compatible with our Declarative Automation Bundles (DABs).
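As one hedged illustration of the jobs path, the Databricks Python SDK can register a training notebook as a job; the job name and notebook path below are hypothetical, and the GPU compute configuration is omitted for brevity:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # picks up workspace auth from the environment

# Register the training notebook as a job. Name and path are hypothetical;
# attach your GPU compute configuration per the AI Runtime docs.
job = w.jobs.create(
    name="nightly-gpu-training",
    tasks=[
        jobs.Task(
            task_key="train",
            notebook_task=jobs.NotebookTask(notebook_path="/Repos/ml/train_model"),
        )
    ],
)
print(f"Created job {job.job_id}")
```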
With our Lakeflow integration, customers can keep model training and fine-tuning tightly synchronized with upstream data pipelines and downstream production systems.
“Databricks' AI Runtime greatly streamlined the process of training a custom Text To Formula (TTF) model. With no infrastructure setup or delays, it was easy to choose the right compute based on prompt size and output token generation. This allowed us to move quickly, maintain our Lakehouse workflows, and deliver a high-quality model with full governance, reducing the time to set up, train, and deploy our model from days to hours.”— Nikhil Sunderraj, Principal Machine Learning Engineer, FactSet Research Systems, Inc.

Distributed training workloads can be painful to prepare, debug, and observe. From troubleshooting RDMA setups to tracking telemetry across multiple GPUs to getting the software configuration right, users can easily miss critical details that dramatically slow model training.
Instead, AI Runtime is optimized for the entire deep learning lifecycle—and is designed to save you time. Key dependencies like PyTorch and CUDA come pre-installed, along with optimized support for distributed training frameworks such as Ray, Hugging Face Transformers, Composer, and other libraries, so you can start training immediately without managing environments. Customers are also welcome to bring their own libraries, from Unsloth to TorchRec to custom training loops.
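For instance, the pre-installed PyTorch stack supports a standard DistributedDataParallel loop out of the box. The sketch below uses a synthetic model and synthetic data purely for illustration:

```python
# Minimal single-node DDP sketch for the pre-installed PyTorch stack.
# Launch with: torchrun --nproc_per_node=8 train.py (8 GPUs on one node).
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")  # NCCL for GPU collectives
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    model = DDP(nn.Linear(128, 1).cuda(), device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    for step in range(100):
        x = torch.randn(32, 128, device="cuda")  # synthetic batch
        loss = model(x).pow(2).mean()  # stand-in loss for illustration
        optimizer.zero_grad()
        loss.backward()  # DDP all-reduces gradients across the ranks
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```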

Integrated SDKs and observability tools simplify the management of distributed training workloads. MLflow enables deep observability of GPU workloads, with automatic tracking of GPU utilization and training experiments. Whether you're fine-tuning foundation models or training forecasting and personalization models, the runtime is optimized to accelerate training workflows with minimal setup.
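As a small illustration of the experiment-tracking side (run name, parameters, and metric values are placeholders), standard MLflow logging calls are all that's needed inside a training loop:

```python
import mlflow

# mlflow.autolog() can capture framework-level metrics automatically; the
# explicit calls below show the manual equivalent.
with mlflow.start_run(run_name="h100-finetune-demo"):  # hypothetical run name
    mlflow.log_params({"lr": 1e-3, "batch_size": 32})
    for epoch in range(3):
        train_loss = 1.0 / (epoch + 1)  # placeholder value for illustration
        mlflow.log_metric("train_loss", train_loss, step=epoch)
```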

Today’s Public Preview of AI Runtime supports distributed training across 8x H100s on a single node, with multi-node support currently in Private Preview.
"Databricks' AI Runtime enables us to efficiently run LLM workloads (fine tuning and inference) without infrastructure overhead, directly in our lakehouse. This seamless integration simplifies our pipelines and provides efficient use of GPUs, enabling us to deliver high quality AI insights to our customers and focus on innovation, not on infrastructure."— Lucas Froguel, Senior AI Platform Engineer, YipitData
AI Runtime integrates natively with the Databricks Lakehouse, enabling you to run and govern GPU workloads where your data resides. This eliminates fragmented workflows and simplifies the path from experimentation to production.
Your AI workloads run fully within your enterprise data perimeter, delivering strong governance and security without sacrificing flexibility for experimentation and scale.
"Leveraging Databricks' serverless GPU support within our Lakehouse enables us to efficiently train advanced audio and multimodal models without infrastructure overhead. This seamless integration simplifies workflows and provides efficient use of GPU resources, ensuring we deliver high-performance systems and focus on innovation."— Arjuna Siva, VP of Infotainment & Connectivity, Rivian and Volkswagen Group Technologies
Demand for accelerated compute continues to grow across AI workloads and agentic systems. AI Runtime enables more Databricks customers to leverage NVIDIA hardware to accelerate their AI workloads and drive their business forward. We are excited to continue partnering with NVIDIA to bring the latest NVIDIA technology, like the RTX PRO 4500 Blackwell Server Edition announced at GTC 2026, to our customers.
"As AI adoption accelerates across industries, organizations need scalable, high-performance infrastructure to power their data and AI workloads. NVIDIA technologies bring accelerated performance to the AI Runtime offering for the Databricks Lakehouse Platform."— Pat Lee, Vice President, Strategic Partnerships at NVIDIA.
To help you get started, we’ve put together several template notebooks and starter guides.
Please reach out to your account team to learn more or if you have any questions!