Generative AI

Introducing Llama2-70B-Chat with MosaicML Inference

Llama2-70B-Chat is a leading AI model for text completion, comparable to ChatGPT in quality. Today, organizations can leverage this state-of-the-art model...

Announcing MPT-7B-8K: 8K Context Length for Document Understanding

July 18, 2023 by Sam Havens and Erica Ji Yuen in Mosaic Research
Today, we are releasing MPT-7B-8K, a 7B-parameter open-source LLM with an 8K context length, trained with the MosaicML platform. MPT-7B-8K was pretrained starting...

How We Trained Stable Diffusion for Less than $50k (Part 3)

In our previous blog post, we showed how we used the MosaicML platform, Streaming datasets, and the Composer library to train a Stable...

Training Stable Diffusion from Scratch for <$50k with MosaicML (Part 2)

We've replicated Stable Diffusion 2 for less than $50k, and we've open-sourced the training code so you can too! This is a 3x...

MosaicBERT: Pretraining BERT from Scratch for $20

With the MosaicBERT architecture + training recipe, you can now pretrain a competitive BERT-Base model from scratch on the MosaicML platform for $20...

MosaicML StreamingDataset: Fast, Accurate Streaming of Training Data from Cloud Storage

Loading training data becomes an escalating challenge as datasets grow and the number of nodes scales. We built StreamingDataset...

5x Faster Image Segmentation Training with MosaicML Recipes

Can't stop, won't stop. Earlier this year, we shared a new baseline for semantic segmentation (classifying each pixel of an image)...

MosaicML Delivers Leading NLP Performance in MLPerf v2.1

MosaicML leads the MLPerf NLP results, delivering a score of 7.9 minutes on 8x NVIDIA A100 GPUs in the Open Division, thanks to...

Farewell, CUDA OOM: Automatic Gradient Accumulation

June 23, 2022 by Mihir Patel and Erica Ji Yuen in Mosaic Research
With automatic gradient accumulation, Composer lets users seamlessly change GPU type and the number of GPUs without worrying about batch size. CUDA...

Blazingly Fast Computer Vision Training with the Mosaic ResNet and Composer

Match benchmark accuracy on ImageNet (He et al., 2015) in 27 minutes, a 7x speedup (ResNet-50 on 8x NVIDIA A100 GPUs). Reach higher levels of accuracy...