Mosaic AI Foundation Model Serving

Two ways to purchase

Pricing

Query state-of-the-art open foundation models and use them to build generative AI applications without maintaining your own model deployment.

Foundation Model On Demand Prices

| Model | Price / 1M input tokens (effective US East price) | Price / 1M output tokens (effective US East price) | DBU / 1M tokens across all regions |
|---|---|---|---|
| DBRX | $2.25 | $6.75 | 32.143 (input) / 96.429 (output) |
| LLaMa-2-70B | $2.00 | $2.00 | 28.571 |
| Mixtral-8-7B | $1.50 | $1.50 | 21.429 |
| MPT-30B | $1.00 | $1.00 | 14.286 |
| LLaMa-2-13B | $0.95 | $0.95 | 13.571 |
| MPT-7B | $0.50 | $0.50 | 7.143 |
| BGE-L | $0.10 | $0.10 | 1.429 |

Foundation Model Provisioned Throughput Prices

| Model | Price/unit/hour [1] (effective US East price) | DBU/hour across all regions |
|---|---|---|
| DBRX | $14.85 | 212.143 |
| 70B Models | $11.00 | 157.143 |
| 30B Models | $7.84 | 112.000 |
| 13B Models | $5.50 | 78.571 |
| Mixtral 8 7B | $20.36 | 290.857 |
| 7B Models | $1.40 | 20.000 |

[1] Minimum units and configurations vary by cloud.

On Demand (Per-Token) Pricing Examples

| Model | Input tokens | Output tokens | Region | Serving price/DBU | Total price |
|---|---|---|---|---|---|
| DBRX | 4,000,000 | 1,000,000 | US East | $0.070 | $15.75 |
| LLaMa-2-70B | 4,000,000 | 1,000,000 | US East | $0.070 | $10.00 |
| Mixtral-8-7B | 4,000,000 | 1,000,000 | AP (Sydney) | $0.088 | $9.43 |
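
The totals in these examples appear to follow from the per-model DBU rates and the regional serving price per DBU. The sketch below reproduces them under that assumption; the function name `on_demand_cost` and its parameters are illustrative and not part of any Databricks API.

```python
def on_demand_cost(input_tokens, output_tokens,
                   dbu_per_m_input, dbu_per_m_output, price_per_dbu):
    """Estimate an on-demand charge in dollars (hypothetical helper).

    Assumes: cost = DBUs consumed x regional serving price per DBU,
    where DBUs are billed per 1M input and per 1M output tokens.
    """
    dbus = ((input_tokens / 1_000_000) * dbu_per_m_input
            + (output_tokens / 1_000_000) * dbu_per_m_output)
    return dbus * price_per_dbu


# Rates taken from the on-demand price table; prices per DBU from the examples.
dbrx = on_demand_cost(4_000_000, 1_000_000, 32.143, 96.429, 0.070)
llama_70b = on_demand_cost(4_000_000, 1_000_000, 28.571, 28.571, 0.070)
mixtral_sydney = on_demand_cost(4_000_000, 1_000_000, 21.429, 21.429, 0.088)

print(round(dbrx, 2), round(llama_70b, 2), round(mixtral_sydney, 2))
```

Rounded to cents, the three results match the listed totals of $15.75, $10.00, and $9.43.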

Provisioned Throughput Pricing Examples

| Model | Hours/month | Region | Serving price/DBU | Monthly price* |
|---|---|---|---|---|
| DBRX | 720 | US East | $0.070 | $10,692 |
| LLaMa-2-70B | 720 | US East | $0.070 | $7,920 |
| Mixtral-8-7B | 720 | AP (Sydney) | $0.088 | $18,429 |

* Per throughput band
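
The monthly figures are consistent with multiplying the DBU/hour rate by the hours used and the regional serving price per DBU. A minimal sketch of that arithmetic follows; `provisioned_monthly_cost` and its `bands` parameter are hypothetical names, and linear scaling across bands is an assumption.

```python
def provisioned_monthly_cost(dbu_per_hour, hours, price_per_dbu, bands=1):
    """Estimate a provisioned throughput charge in dollars (hypothetical helper).

    Assumes: cost = DBU/hour x hours x price per DBU x number of bands.
    """
    return dbu_per_hour * hours * price_per_dbu * bands


# DBU/hour rates from the provisioned throughput price table.
dbrx_month = provisioned_monthly_cost(212.143, 720, 0.070)
llama_70b_month = provisioned_monthly_cost(157.143, 720, 0.070)
mixtral_sydney_month = provisioned_monthly_cost(290.857, 720, 0.088)

print(round(dbrx_month), round(llama_70b_month), round(mixtral_sydney_month))
```

Rounded to whole dollars, these match the listed monthly prices of $10,692, $7,920, and $18,429 for one throughput band.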

Provisioned Throughput Estimated Token Capacity

| Model | Maximum tokens/sec for typical workloads, per throughput band |
|---|---|
| DBRX | 600 |
| LLaMa-2-70B | 635 |
| Mixtral-8-7B | 1,700 |
| MPT-30B | 450 |
| LLaMa-2-13B | 980 |
| MPT-7B | 2,450 |
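
For rough sizing, the capacity figures can be turned into a band count for a target throughput. The sketch below assumes that capacity scales linearly with the number of bands and that the workload is "typical" in the sense used by the table; neither is guaranteed by the page, and `bands_needed` is a hypothetical helper.

```python
import math

# Maximum tokens/sec per throughput band, copied from the capacity table.
TOKENS_PER_SEC_PER_BAND = {
    "DBRX": 600,
    "LLaMa-2-70B": 635,
    "Mixtral-8-7B": 1700,
    "MPT-30B": 450,
    "LLaMa-2-13B": 980,
    "MPT-7B": 2450,
}


def bands_needed(target_tokens_per_sec, model):
    """Return the smallest band count whose estimated capacity covers the target.

    Assumption: capacity grows linearly with the number of bands.
    """
    per_band = TOKENS_PER_SEC_PER_BAND[model]
    return max(1, math.ceil(target_tokens_per_sec / per_band))


print(bands_needed(1500, "DBRX"))
```

Because the table lists maxima, treat the result as a lower bound and confirm it against measured throughput for the actual workload.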

Pay as you go with a 14-day free trial or contact us for committed-use discounts or custom requirements.

Mosaic AI Model Serving FAQ

Our regional prices are based on the regional cost of infrastructure supporting our serverless products.