Mosaic AI Foundation Model Serving
Two ways to purchase
Access and query state-of-the-art open foundation models, and use them to build high-quality generative AI applications quickly without maintaining your own model deployment.
Foundation Model Serving DBU rates and Throughput
Model | Pay-Per-Token: DBU / 1M input tokens (Global) | Pay-Per-Token: DBU / 1M output tokens (Global) | Provisioned Throughput¹: DBU / hour (Global) | Throughput Band² (max tokens / sec)³ |
---|---|---|---|---|
Llama 3.1 405B | 142.857 | 428.571 | 700.000 | 2,200 |
Llama 3.1 70B | 14.286 | 42.857 | 424.286 | 6,000 |
Llama 3.1 8B | n/a | n/a | 106.000 | 12,000 |
DBRX | 10.714 | 32.143 | 171.429 | 650 |
Llama 3 70B | 14.286 | 42.857 | 212.143 | 1,000 |
Llama 3 8B | n/a | n/a | 106.000 | 3,000 |
Llama 2 70B | 7.143 | 21.429 | 157.143 | 600 |
Llama 2 13B | n/a | n/a | 78.571 | 450 |
Mixtral 8x7B | 7.143 | 14.286 | 157.143 | 5,000 |
MPT 30B | 14.286 | 14.286 | 112.000 | 635 |
MPT 7B | 7.143 | 7.143 | 20.000 | 2,450 |
GTE | 1.857 | 1.857 | n/a | n/a |
BGE Large | 1.429 | 1.429 | 10.480 | n/a |
1: Throughput shown is an example based on a typical real-time use case with 3,500 input tokens and 300 output tokens. Actual throughput will vary depending on the use case, query shape and other factors.
2: A throughput band is the model-specific maximum throughput (tokens per second) provided at the per-hour price above. Provisioned Throughput Serving provides model throughput in increments of the model's throughput band. For higher throughput, the customer sets an appropriate multiple of the throughput band, which is charged at the same multiple of the per-hour price.
3: Shown for serving on Azure. Some figures differ on AWS, where they are charged at a different price.
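As a rough illustration of footnote 2, the sketch below estimates how many throughput bands a workload might need and the resulting hourly DBU charge. The band size and per-hour rate are copied from the Llama 3.1 70B row above. The function name `bands_needed`, the 15,000 tokens/sec target, and the choice to round up to a whole number of bands are assumptions made for this example, not part of the published pricing. Per footnote 1, actual throughput varies by workload, so the result is an estimate rather than a guarantee.

```python
import math

# Figures copied from the rate table above (Llama 3.1 70B row).
BAND_TOKENS_PER_SEC = 6_000
DBU_PER_HOUR_PER_BAND = 424.286


def bands_needed(target_tokens_per_sec: float, band_tokens_per_sec: float) -> int:
    """Smallest whole number of throughput bands that covers the target.

    Assumption: bands are purchased in whole multiples, so we round up.
    """
    if target_tokens_per_sec <= 0 or band_tokens_per_sec <= 0:
        raise ValueError("throughput values must be positive")
    return math.ceil(target_tokens_per_sec / band_tokens_per_sec)


# Hypothetical workload peaking at 15,000 tokens per second.
bands = bands_needed(15_000, BAND_TOKENS_PER_SEC)

# The per-hour charge scales with the same multiple as the band count.
hourly_dbu = bands * DBU_PER_HOUR_PER_BAND

print(f"bands: {bands}, DBU per hour: {hourly_dbu:.3f}")
```

In this sketch, a target that falls between two multiples of the band is charged at the higher multiple, which is why the target is rounded up rather than to the nearest band.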
Pay-Per-Token Serving Pricing Examples
Model | Input tokens | Output tokens | Region | Unit price $ / DBU | Total Price |
---|---|---|---|---|---|
Llama 3.1 405B | 4,000,000 | 1,000,000 | US East | $0.070 | $70.00 |
Llama 3.1 70B | 4,000,000 | 1,000,000 | US East | $0.070 | $7.00 |
DBRX | 4,000,000 | 1,000,000 | Europe (Ireland) | $0.077 | $5.78 |
Mixtral 8x7B | 4,000,000 | 1,000,000 | AP (Sydney) | $0.088 | $3.77 |
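The totals in these examples appear to follow from the rate table: convert input and output tokens to DBUs at the per-million rates, then multiply by the regional price per DBU. The sketch below reproduces that arithmetic. The function and dictionary names are hypothetical, and because the published DBU rates are rounded to three decimals, results can differ from the table by up to a cent.

```python
# DBU per 1M tokens as (input, output), copied from the rate table above.
PAY_PER_TOKEN_RATES = {
    "Llama 3.1 405B": (142.857, 428.571),
    "Llama 3.1 70B": (14.286, 42.857),
    "DBRX": (10.714, 32.143),
    "Mixtral 8x7B": (7.143, 14.286),
}


def pay_per_token_price(
    model: str, input_tokens: int, output_tokens: int, usd_per_dbu: float
) -> float:
    """Estimated price in dollars for a pay-per-token workload."""
    input_rate, output_rate = PAY_PER_TOKEN_RATES[model]
    # Rates are quoted per one million tokens.
    dbus = (input_tokens / 1_000_000) * input_rate
    dbus += (output_tokens / 1_000_000) * output_rate
    return dbus * usd_per_dbu


# The four example rows above: (model, $ per DBU in the listed region).
examples = [
    ("Llama 3.1 405B", 0.070),
    ("Llama 3.1 70B", 0.070),
    ("DBRX", 0.077),
    ("Mixtral 8x7B", 0.088),
]

for model, usd_per_dbu in examples:
    price = pay_per_token_price(model, 4_000_000, 1_000_000, usd_per_dbu)
    print(f"{model}: about ${price:.2f}")
```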
Provisioned Throughput Serving Pricing Examples
Model | Throughput bands | Hours / month | Region | Unit price $ / DBU | Monthly Price |
---|---|---|---|---|---|
Llama 3.1 405B | 1 | 720 | US East | $0.070 | $35,280 |
Llama 3.1 70B | 1 | 720 | US East | $0.070 | $21,384 |
DBRX | 1 | 720 | US East | $0.070 | $8,640 |
Mixtral 8x7B | 2 | 720 | Europe (Ireland) | $0.077 | $17,424 |
Llama 3.1 8B | 4 | 720 | AP (Sydney) | $0.088 | $26,865 |
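The monthly prices above are consistent with a simple product: DBU per hour for one band, times the number of bands, times hours in the month, times the regional price per DBU. The following sketch checks two of the rows under that assumption. The function name is hypothetical, and small differences from the table come from the rounded DBU rates.

```python
def provisioned_monthly_price(
    dbu_per_hour: float, bands: int, hours: float, usd_per_dbu: float
) -> float:
    """Estimated monthly price in dollars for provisioned throughput."""
    if bands < 1:
        raise ValueError("at least one throughput band is required")
    return dbu_per_hour * bands * hours * usd_per_dbu


# Mixtral 8x7B: 157.143 DBU/hour per band, 2 bands, Europe (Ireland).
mixtral = provisioned_monthly_price(157.143, bands=2, hours=720, usd_per_dbu=0.077)

# Llama 3.1 8B: 106.000 DBU/hour per band, 4 bands, AP (Sydney).
llama_8b = provisioned_monthly_price(106.000, bands=4, hours=720, usd_per_dbu=0.088)

print(f"Mixtral 8x7B: about ${mixtral:,.0f} per month")
print(f"Llama 3.1 8B: about ${llama_8b:,.0f} per month")
```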
Pay as you go with a 14-day free trial, or contact us for committed-use discounts or custom requirements.