Mosaic AI Foundation Model Serving

Two ways to purchase

Access and query state-of-the-art open foundation models to quickly build applications on a high-quality generative AI model, without maintaining your own model deployment.

Foundation Model Serving DBU rates and Throughput

| Model | Pay-Per-Token: DBU / 1M input tokens (Global) | Pay-Per-Token: DBU / 1M output tokens (Global) | Provisioned Throughput¹: DBU / hour (Global) | Provisioned Throughput¹: Throughput Band² (max tokens / sec)³ |
| --- | --- | --- | --- | --- |
| Llama 3.1 405B | 142.857 | 428.571 | 700.000 | 2,200 |
| Llama 3.1 70B | 14.286 | 42.857 | 424.286 | 6,000 |
| Llama 3.1 8B | n/a | n/a | 106.000 | 12,000 |
| DBRX | 10.714 | 32.143 | 171.429 | 650 |
| Llama 3 70B | 14.286 | 42.857 | 212.143 | 1,000 |
| Llama 3 8B | n/a | n/a | 106.000 | 3,000 |
| Llama 2 70B | 7.143 | 21.429 | 157.143 | 600 |
| Llama 2 13B | n/a | n/a | 78.571 | 450 |
| Mixtral 8x7B | 7.143 | 14.286 | 157.143 | 5,000 |
| MPT 30B | 14.286 | 14.286 | 112.000 | 635 |
| MPT 7B | 7.143 | 7.143 | 20.000 | 2,450 |
| GTE | 1.857 | 1.857 | n/a | n/a |
| BGE Large | 1.429 | 1.429 | 10.480 | n/a |

1: Throughput shown is an example based on a typical real-time use case with input / output of 3500 / 300 tokens. Actual throughput will vary, depending on the use case, query shape and other factors.

2: The throughput band is the model-specific maximum throughput (tokens per second) provided at the per-hour price above. With Provisioned Throughput Serving, model throughput is provided in increments of the model's throughput band; for higher throughput, the customer sets an appropriate multiple of the throughput band, which is then charged at the same multiple of the per-hour price above.

3: Shown for serving on Azure. Some numbers are different on AWS when charged at a different price.
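
The throughput-band rule in note 2 can be sketched in a few lines of Python. This is only an illustration: the function names and the target throughput of 1,500 tokens per second are hypothetical, the band sizes and DBU / hour rates are copied from the rate table above for two models, and rounding up to a whole number of bands is an assumption based on throughput being provided in band increments.

```python
import math

# (throughput band in max tokens / sec, DBU / hour per band), copied from the rate table.
BANDS = {
    "Llama 3.1 70B": (6000, 424.286),
    "DBRX": (650, 171.429),
}

def bands_needed(model: str, target_tokens_per_sec: float) -> int:
    """Smallest whole number of throughput bands covering the target (hypothetical helper)."""
    band_size, _ = BANDS[model]
    return math.ceil(target_tokens_per_sec / band_size)

def hourly_dbu(model: str, target_tokens_per_sec: float) -> float:
    """DBU per hour charged for that multiple of the throughput band."""
    _, dbu_per_hour = BANDS[model]
    return bands_needed(model, target_tokens_per_sec) * dbu_per_hour

# Hypothetical target: 1,500 tokens / sec on DBRX needs 3 bands of 650 tokens / sec.
print(bands_needed("DBRX", 1500), hourly_dbu("DBRX", 1500))
```

Actual throughput depends on the use case and query shape (note 1), so a result like this is a sizing estimate rather than a guarantee.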

Pay-Per-Token Serving Pricing Examples

| Model | Input tokens | Output tokens | Region | Unit price ($ / DBU) | Total Price |
| --- | --- | --- | --- | --- | --- |
| Llama 3.1 405B | 4,000,000 | 1,000,000 | US East | $0.070 | $70.00 |
| Llama 3.1 70B | 4,000,000 | 1,000,000 | US East | $0.070 | $7.00 |
| DBRX | 4,000,000 | 1,000,000 | Europe (Ireland) | $0.077 | $5.78 |
| Mixtral 8x7B | 4,000,000 | 1,000,000 | AP (Sydney) | $0.088 | $3.77 |
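
The totals in these examples appear to follow from the per-token DBU rates: DBUs consumed for input plus DBUs consumed for output, multiplied by the regional $ / DBU price. A minimal sketch of that arithmetic, assuming the rates published in the rate table (the function name `pay_per_token_cost` is illustrative, not part of any Databricks API):

```python
# (DBU per 1M input tokens, DBU per 1M output tokens), copied from the rate table.
RATES = {
    "Llama 3.1 405B": (142.857, 428.571),
    "Llama 3.1 70B": (14.286, 42.857),
    "DBRX": (10.714, 32.143),
    "Mixtral 8x7B": (7.143, 14.286),
}

def pay_per_token_cost(model: str, input_tokens: int, output_tokens: int,
                       usd_per_dbu: float) -> float:
    """Estimated cost in USD for one pay-per-token workload (hypothetical helper)."""
    in_rate, out_rate = RATES[model]
    dbus = (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate
    return dbus * usd_per_dbu

# Llama 3.1 405B in US East: about 1,000 DBU at $0.070 per DBU.
print(f"{pay_per_token_cost('Llama 3.1 405B', 4_000_000, 1_000_000, 0.070):.2f}")
```

Because the published rates are rounded to three decimals, a result can differ from the listed total by about a cent (the DBRX example is one such case).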

Provisioned Throughput Serving Pricing Examples

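| Model | Throughput bands | Hours / month | Region | Unit price ($ / DBU) | Monthly Price |
| --- | --- | --- | --- | --- | --- |
| Llama 3.1 405B | 1 | 720 | US East | $0.070 | $35,280 |
| Llama 3.1 70B | 1 | 720 | US East | $0.070 | $21,384 |
| DBRX | 1 | 720 | US East | $0.070 | $8,640 |
| Mixtral 8x7B | 2 | 720 | Europe (Ireland) | $0.077 | $17,424 |
| Llama 3.1 8B | 4 | 720 | AP (Sydney) | $0.088 | $26,865 |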
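
The monthly prices above are consistent with multiplying the number of throughput bands by hours per month, the model's DBU / hour rate, and the regional $ / DBU price. A short sketch under that assumption, using the DBU / hour rates from the rate table (the function name `provisioned_monthly_cost` is illustrative only):

```python
# DBU / hour per throughput band, copied from the rate table.
DBU_PER_HOUR = {
    "Llama 3.1 405B": 700.000,
    "Llama 3.1 70B": 424.286,
    "DBRX": 171.429,
    "Mixtral 8x7B": 157.143,
    "Llama 3.1 8B": 106.000,
}

def provisioned_monthly_cost(model: str, bands: int, hours: float,
                             usd_per_dbu: float) -> float:
    """Estimated monthly cost in USD for provisioned throughput (hypothetical helper)."""
    return bands * hours * DBU_PER_HOUR[model] * usd_per_dbu

# Mixtral 8x7B, 2 bands, 720 hours in Europe (Ireland) at $0.077 per DBU.
print(round(provisioned_monthly_cost("Mixtral 8x7B", 2, 720, 0.077)))
```

Rounding to the nearest dollar reproduces the figures in the table for all five rows.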

Pay as you go with a 14-day free trial or contact us for committed-use discounts or custom requirements.

Mosaic AI Foundation Model Serving FAQ