OpenAI

Reserved Capacity

This offering is available to Enterprise customers. Please contact our sales team to learn more.

Reserved instances are designed for cutting-edge customers running larger workloads, allowing inference at scale with full control over the model configuration and performance profile.

Today, reserved instances offer the following:

  • Reserved instances are a static allocation of capacity dedicated to you, providing a predictable environment that you control.
  • You can monitor your specific instances with the same tools and dashboards OpenAI uses to build on our own models and optimize shared-capacity models.
  • You can realize all the throughput, latency, and cost benefits from optimizing your specific workload (for example, caching and latency/throughput tradeoffs).
  • You choose when to update the snapshot of your model, deciding if and when to use the latest models. If you want to change your model, just let our support team know.
  • Other models are available upon request.

Reserved instances offer SLAs for instance uptime and on-call engineering support:

  • 99.5% uptime commitment
  • On-call engineering support for reserved instance customers
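To make the uptime figure concrete, the sketch below converts an uptime percentage into a downtime budget. The measurement window is not stated on this page, so the 30-day and 365-day periods are assumptions chosen for illustration, and `allowed_downtime_hours` is a hypothetical helper name.

```python
def allowed_downtime_hours(uptime_pct: float, period_days: float) -> float:
    """Hours of downtime permitted by an uptime percentage over a period.

    Assumption: the commitment is measured over `period_days` calendar days;
    the actual SLA measurement window is not given on this page.
    """
    return (100 - uptime_pct) / 100 * period_days * 24


if __name__ == "__main__":
    # 99.5% uptime leaves 0.5% of the period as the downtime budget.
    print(f"{allowed_downtime_hours(99.5, 30):.1f} hours per 30 days")
    print(f"{allowed_downtime_hours(99.5, 365):.1f} hours per 365 days")
```

Under these assumptions, 99.5% works out to roughly 3.6 hours per 30-day month, or about 43.8 hours per year.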

Reserved instance rentals are based on reserved compute units with 3-month or 1-year (~15% savings) commitments. Running an individual model instance (see below for current SKUs) requires a specific number of compute units:

| | 3-month commit: Monthly cost | 3-month commit: Total commit | 1-year commit (~15% savings): Monthly cost | 1-year commit (~15% savings): Total commit |
|---|---|---|---|---|
| Price / Unit | $260 | $780 | $220 | $2,640 |

| Model instance | Max context | Minimum # Instances | Units / Instance | 3-month commit per instance*: Monthly cost | 3-month commit per instance*: Total commit | 1 year commit per instance: Monthly cost | 1 year commit per instance: Total commit |
|---|---|---|---|---|---|---|---|
| GPT-4.1 | 128k | 3 | 400 | $104,000 | $312,000 | $88,000 | $1,056,000 |
| GPT-4.1 mini | 128k | 3 | 200 | $52,000 | $156,000 | $44,000 | $528,000 |
| GPT-4.1 nano | 128k | 3 | 100 | $26,000 | $78,000 | $22,000 | $264,000 |
| GPT-4o mini | 128k | 3 | 100 | $26,000 | $78,000 | $22,000 | $264,000 |
| GPT-4o | 128k | 3 | 500 | $130,000 | $390,000 | $110,000 | $1,320,000 |
| o1-2024-12-17 | 200k | 3 | 800 | $208,000 | $624,000 | $176,000 | $2,112,000 |
| o3-mini-2025-01-31 | | 3 | 150 | $39,000 | $117,000 | $33,000 | $396,000 |
| GPT-4 Turbo | 128k | 2 | 300 | $78,000 | $234,000 | $66,000 | $792,000 |
| GPT-4 (0613) | 8k | 2 | 300 | $78,000 | $234,000 | $66,000 | $792,000 |
| GPT-3.5 Turbo | 16k | 1 | 100 | $26,000 | $78,000 | $22,000 | $264,000 |
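The per-instance figures above follow from the unit price: monthly cost is units per instance times the price per unit, and the total commit is that monthly cost times the term length. A minimal sketch of the arithmetic follows; the function names are hypothetical, and treating the minimum instance count as a linear multiplier on the per-instance figure is an assumption, not something this page states.

```python
# Per-unit monthly prices (USD) and term lengths, copied from the pricing table.
PRICE_PER_UNIT = {"3-month": 260, "1-year": 220}
TERM_MONTHS = {"3-month": 3, "1-year": 12}


def commitment_cost(units_per_instance: int, term: str, instances: int = 1) -> tuple[int, int]:
    """Return (monthly cost, total commit) in USD.

    Assumption: cost scales linearly with the number of instances.
    """
    monthly = units_per_instance * instances * PRICE_PER_UNIT[term]
    return monthly, monthly * TERM_MONTHS[term]


def one_year_savings() -> float:
    """Fractional discount of the 1-year unit price relative to the 3-month price."""
    return 1 - PRICE_PER_UNIT["1-year"] / PRICE_PER_UNIT["3-month"]


if __name__ == "__main__":
    # GPT-4.1 is listed at 400 units per instance.
    print(commitment_cost(400, "3-month"))  # (104000, 312000)
    print(commitment_cost(400, "1-year"))   # (88000, 1056000)
    # Hypothetical: the listed minimum of 3 instances, assuming linear scaling.
    print(commitment_cost(400, "3-month", instances=3))  # (312000, 936000)
    print(f"{one_year_savings():.1%}")      # 15.4%
```

The first two results match the GPT-4.1 row, and the computed discount of about 15.4% is consistent with the "~15% savings" label on the 1-year commitment.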

FAQ

No: we'll never use or train on your data. Reserved Capacity adheres to our Enterprise Privacy commitments.

**_What happens when the pay-as-you-go price of a model changes?_**

Reserved Capacity pricing is based on raw compute costs. Sometimes pay-as-you-go price changes are enabled by improved model efficiency, in which case updating to that model will yield more throughput on your reserved instances. Sometimes pay-as-you-go price changes are made for reasons separate from model efficiency, in which case your reserved capacity throughput will be unaffected.