Mihir Patel

Mihir Patel's posts

Mosaic Research

July 1, 2024/Less than a minute

Training MoEs at Scale with PyTorch and Databricks

Mosaic Research

May 14, 2024/9 min read

Building DBRX-class Custom LLMs with Mosaic AI Training

Mosaic Research

April 9, 2024/4 min read

Bringing MegaBlocks to Databricks

Mosaic Research

March 21, 2024/4 min read

Turbocharged Training: Optimizing the Databricks Mosaic AI Stack With FP8

Mosaic Research

April 28, 2023/6 min read

How We Trained Stable Diffusion for Less than $50k (Part 3)

Mosaic Research

April 26, 2023/6 min read

Training Stable Diffusion from Scratch for <$50k with MosaicML (Part 2)

Mosaic Research

June 23, 2022/5 min read

Farewell, CUDA OOM: Automatic Gradient Accumulation