Pricing

Language models

Multiple models, each with different capabilities and price points. Prices can be viewed in units of either per 1M or 1K tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 35 tokens.

Language models are also available through the Batch API, which returns completions within 24 hours at a 50% discount.

Learn about the Batch API

GPT-4 Turbo

With 128k context, fresher knowledge and the broadest set of capabilities, GPT-4 Turbo is more powerful than GPT-4 and offered at a lower price.

Learn about GPT-4 Turbo

Model	Input	Output
gpt-4-turbo-2024-04-09	$10.00 / 1M tokens	$30.00 / 1M tokens

Vision pricing calculator

GPT-4

With broad general knowledge and domain expertise, GPT-4 can follow complex instructions in natural language and solve difficult problems with accuracy.

Learn about GPT-4

Model	Input	Output
gpt-4	$30.00 / 1M tokens	$60.00 / 1M tokens
gpt-4-32k	$60.00 / 1M tokens	$120.00 / 1M tokens

GPT-3.5 Turbo

GPT-3.5 Turbo models are capable and cost-effective.

gpt-3.5-turbo-0125 is the flagship model of this family, supports a 16K context window and is optimized for dialog.

gpt-3.5-turbo-instruct is an Instruct model and only supports a 4K context window.

Learn about GPT-3.5 Turbo

Model	Input	Output
gpt-3.5-turbo-0125	$0.50 / 1M tokens	$1.50 / 1M tokens
gpt-3.5-turbo-instruct	$1.50 / 1M tokens	$2.00 / 1M tokens
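
As a rough illustration of how the per-1M-token prices in the tables above turn into a dollar amount, here is a minimal sketch. The function name `estimate_cost` and the `batch` flag are hypothetical helpers, not part of any official SDK, and the sketch assumes the 50% Batch API discount applies uniformly to both input and output tokens.

```python
# Prices in USD per 1M tokens as (input, output), copied from the tables above.
PRICES_PER_1M = {
    "gpt-4-turbo-2024-04-09": (10.00, 30.00),
    "gpt-4": (30.00, 60.00),
    "gpt-4-32k": (60.00, 120.00),
    "gpt-3.5-turbo-0125": (0.50, 1.50),
    "gpt-3.5-turbo-instruct": (1.50, 2.00),
}


def estimate_cost(model, input_tokens, output_tokens, batch=False):
    """Estimate the USD cost of a request (hypothetical helper).

    Assumption: the Batch API discount is a flat 50% on the total.
    """
    input_price, output_price = PRICES_PER_1M[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    if batch:
        cost *= 0.5
    return cost


if __name__ == "__main__":
    # 2,000 input tokens and 500 output tokens on gpt-4.
    print(f"${estimate_cost('gpt-4', 2_000, 500):.4f}")
```

Keeping the prices in one table makes it easy to compare the same workload across models before committing to one.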

Assistants API

The Assistants API and its tools make it easy for developers to build AI assistants in their applications. The tokens used for the Assistants API are billed at the chosen language model's per-token input / output rates. Additionally, we charge the following fees for tool usage:

Tool	Price
Code interpreter	$0.03 / session
File Search	$0.10 / GB of vector-storage per day (1 GB free)

GB refers to binary gigabytes (also known as gibibytes), where 1 GB is 2^30 bytes.
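
The tool fees above can be sketched as simple arithmetic. The helper names below are hypothetical, and the sketch assumes that the free 1 GB is subtracted from the total stored and that fractional gigabytes are billed proportionally; the table does not spell out either detail.

```python
GIB = 2 ** 30  # 1 GB as defined above: 2^30 bytes


def file_search_daily_cost(stored_bytes, price_per_gb_day=0.10, free_gb=1.0):
    """Daily File Search storage fee in USD (hypothetical helper).

    Assumptions: the first GB is free and partial GBs are prorated.
    """
    stored_gb = stored_bytes / GIB
    billable_gb = max(0.0, stored_gb - free_gb)
    return billable_gb * price_per_gb_day


def code_interpreter_cost(sessions, price_per_session=0.03):
    """Code interpreter fee in USD for a number of sessions."""
    return sessions * price_per_session


if __name__ == "__main__":
    # 3 GB of vector storage: 2 billable GB at $0.10 per GB per day.
    print(f"${file_search_daily_cost(3 * GIB):.2f} per day")
    print(f"${code_interpreter_cost(10):.2f} for 10 sessions")
```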

Fine-tuning models

Create your own custom models by fine-tuning our base models with your training data. Once you fine-tune a model, you’ll be billed only for the tokens you use in requests to that model.

Learn about fine-tuning

Model	Training	Input usage	Output usage
gpt-3.5-turbo	$8.00 / 1M tokens	$3.00 / 1M tokens	$6.00 / 1M tokens
davinci-002	$6.00 / 1M tokens	$12.00 / 1M tokens	$12.00 / 1M tokens
babbage-002	$0.40 / 1M tokens	$1.60 / 1M tokens	$1.60 / 1M tokens

Embedding models

Build advanced search, clustering, topic modeling, and classification functionality with our embeddings offering.

Learn about embeddings

Model	Usage
text-embedding-3-small	$0.02 / 1M tokens
text-embedding-3-large	$0.13 / 1M tokens
ada v2	$0.10 / 1M tokens

Base models

GPT base models are not optimized for instruction-following and are less capable, but they can be effective when fine-tuned for narrow tasks.

Learn about GPT base models

Model	Usage
davinci-002	$2.00 / 1M tokens
babbage-002	$0.40 / 1M tokens

Other models

Image models

Build DALL·E directly into your apps to generate and edit novel images and art. DALL·E 3 is the highest quality model and DALL·E 2 is optimized for lower cost.

Learn about image generation

Model	Quality	Resolution	Price
DALL·E 3	Standard	1024×1024	$0.040 / image
DALL·E 3	Standard	1024×1792, 1792×1024	$0.080 / image
DALL·E 3	HD	1024×1024	$0.080 / image
DALL·E 3	HD	1024×1792, 1792×1024	$0.120 / image
DALL·E 2		1024×1024	$0.020 / image
DALL·E 2		512×512	$0.018 / image
DALL·E 2		256×256	$0.016 / image

Audio models

Whisper can transcribe speech into text and translate many languages into English.

Text-to-speech (TTS) can convert text into spoken audio.

Model	Usage
Whisper	$0.006 / minute (rounded to the nearest second)
TTS	$15.00 / 1M characters
TTS HD	$30.00 / 1M characters
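
The audio prices above can be turned into estimates with a short sketch. The function names are hypothetical. The table says Whisper usage is rounded to the nearest second but not how an exact half second is treated, so the sketch assumes halves round up.

```python
import math

WHISPER_PRICE_PER_MINUTE = 0.006  # USD, from the table above
TTS_PRICE_PER_1M_CHARS = 15.00    # USD
TTS_HD_PRICE_PER_1M_CHARS = 30.00  # USD


def whisper_cost(duration_seconds):
    """Estimate Whisper cost in USD, billing by the nearest whole second.

    Assumption: an exact half second rounds up.
    """
    billed_seconds = math.floor(duration_seconds + 0.5)
    return billed_seconds * WHISPER_PRICE_PER_MINUTE / 60


def tts_cost(characters, hd=False):
    """Estimate text-to-speech cost in USD for a number of input characters."""
    price = TTS_HD_PRICE_PER_1M_CHARS if hd else TTS_PRICE_PER_1M_CHARS
    return characters * price / 1_000_000


if __name__ == "__main__":
    print(f"${whisper_cost(90.4):.4f}")        # 90 billed seconds
    print(f"${tts_cost(2_000, hd=True):.4f}")  # 2,000 characters with TTS HD
```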

Please note that our Usage Policies require you to provide a clear disclosure to end users that the TTS voice they are hearing is AI-generated and not a human voice.

Older models

We continue to improve our models and periodically retire older, less used models.

View pricing info for other older models

Model	Input	Output
gpt-4-0125-preview	$10.00 / 1M tokens	$30.00 / 1M tokens
gpt-4-1106-preview	$10.00 / 1M tokens	$30.00 / 1M tokens
gpt-4-vision-preview	$10.00 / 1M tokens	$30.00 / 1M tokens
gpt-3.5-turbo-1106	$1.00 / 1M tokens	$2.00 / 1M tokens
gpt-3.5-turbo-0613	$1.50 / 1M tokens	$2.00 / 1M tokens
gpt-3.5-turbo-16k-0613	$3.00 / 1M tokens	$4.00 / 1M tokens
gpt-3.5-turbo-0301	$1.50 / 1M tokens	$2.00 / 1M tokens

Simple and flexible

Pay as you go
To keep things simple and flexible, pay only for the resources you use.

Choose your model
Use the right model for the job. We offer a spectrum of capabilities and price points.

FAQ

What’s a token?
You can think of tokens as pieces of words used for natural language processing. For English text, 1 token is approximately 4 characters or 0.75 words. As a point of reference, the collected works of Shakespeare are about 900,000 words or 1.2M tokens.
To learn more about how tokens work and estimate your usage:
Experiment with our interactive Tokenizer tool.
Log in to your account and enter text into the Playground. The counter in the footer will display how many tokens are in your text.
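
For a quick back-of-the-envelope estimate, the rules of thumb above (1 token is roughly 4 characters or 0.75 English words) can be written as a small sketch. The function names are hypothetical, and the results are approximations only; a tokenizer gives the exact count.

```python
def estimate_tokens_from_words(word_count):
    """Approximate token count from an English word count (1 token ~ 0.75 words)."""
    return round(word_count / 0.75)


def estimate_tokens_from_chars(char_count):
    """Approximate token count from a character count (1 token ~ 4 characters)."""
    return round(char_count / 4)


if __name__ == "__main__":
    # The Shakespeare reference point above: about 900,000 words.
    print(estimate_tokens_from_words(900_000))
    print(estimate_tokens_from_chars(4_000))
```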

Which model should I use?
We generally recommend that developers use either gpt-4 or gpt-3.5-turbo, depending on the complexity of the tasks involved. gpt-4 generally performs better on a wide range of evaluations, while gpt-3.5-turbo returns outputs with lower latency and costs much less per token. We recommend experimenting with these models in Playground to find which one provides the best price-performance trade-off for your usage. A common design pattern is to define several distinct query types and dispatch each one to the model best suited to handle it.

How can I manage my spending?
You can set a monthly budget in your billing settings, after which we’ll stop serving your requests. There may be a delay in enforcing the limit, and you are responsible for any overage incurred. You can also configure an email notification threshold to receive an email alert once you cross that threshold each month. We recommend checking your usage tracking dashboard regularly to monitor your spend.

How is pricing calculated for Completions?
Chat completion requests are billed based on the number of input tokens sent plus the number of tokens in the output(s) returned by the API.
Your request may use up to num_tokens(input) + [max_tokens * max(n, best_of)] tokens, which will be billed at the per-model rates outlined at the top of this page.
In the simplest case, if your prompt contains 200 tokens and you request a single 900-token completion from the gpt-3.5-turbo-1106 API, your request will use 1,100 tokens. At $0.001 per 1K input tokens and $0.002 per 1K output tokens, it will cost [(200 * 0.001) + (900 * 0.002)] / 1000 = $0.002.
You can limit costs by reducing prompt length or maximum response length, limiting usage of best_of/n, adding appropriate stop sequences, or using models with lower per-token costs.
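
The token bound and the worked example above can be reproduced with a minimal sketch. The function names are hypothetical; the prices are passed in per 1K tokens to match the example ($0.001 input and $0.002 output for gpt-3.5-turbo-1106).

```python
def max_billable_tokens(prompt_tokens, max_tokens, n=1, best_of=1):
    """Upper bound on tokens used: num_tokens(input) + max_tokens * max(n, best_of)."""
    return prompt_tokens + max_tokens * max(n, best_of)


def request_cost(prompt_tokens, completion_tokens,
                 input_price_per_1k, output_price_per_1k):
    """Cost in USD given per-1K-token input and output prices."""
    return (prompt_tokens * input_price_per_1k
            + completion_tokens * output_price_per_1k) / 1000


if __name__ == "__main__":
    # 200-token prompt, single 900-token completion.
    print(max_billable_tokens(200, 900))
    print(f"${request_cost(200, 900, 0.001, 0.002):.3f}")
```

Raising n or best_of multiplies only the completion side of the bound, which is why limiting them is listed above as a way to control cost.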

How is pricing calculated for Fine-tuning?
There are two components to fine-tuning pricing: training and usage.
When training a fine-tuned model, the total tokens used will be billed according to our training rates. Note that the number of training tokens depends on the number of tokens in your training dataset and your chosen number of training epochs. The default number of epochs is 4.
(Tokens in your training file * Number of training epochs) = Total training tokens
Once you fine-tune a model, you’ll be billed only for the tokens you use. Requests sent to fine-tuned models are billed at our usage rates.
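
The training formula above can be sketched as follows. The function name is hypothetical, and the default price is the gpt-3.5-turbo training rate of $8.00 per 1M tokens from the fine-tuning table; swap in another rate for other base models.

```python
def training_cost(tokens_in_file, epochs=4, price_per_1m=8.00):
    """Return (total training tokens, USD cost).

    Total training tokens = tokens in training file * number of epochs.
    The default of 4 epochs matches the default stated above.
    """
    total_tokens = tokens_in_file * epochs
    return total_tokens, total_tokens * price_per_1m / 1_000_000


if __name__ == "__main__":
    # A 100,000-token training file at the default 4 epochs.
    tokens, cost = training_cost(100_000)
    print(tokens, f"${cost:.2f}")
```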

Vision pricing calculator (example for a 150 × 150 image)

Price per 1K tokens (fixed)	$0.01
Resized width	150
Resized height	150
512 × 512 tiles	1 × 1
Total tiles	1
Base tokens	85
Tile tokens	170 × 1 = 170
Total tokens	255
Total price	$0.00255
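
The calculator rows above can be reproduced with a short sketch. The function names are hypothetical. The sketch takes the already-resized width and height as input, because the resizing step itself is not shown here, and it assumes the tile count is the number of 512 × 512 tiles needed to cover the resized image.

```python
import math

BASE_TOKENS = 85        # "Base tokens" row above
TOKENS_PER_TILE = 170   # per 512 x 512 tile, from the "Tile tokens" row
TILE_SIZE = 512


def vision_tokens(resized_width, resized_height):
    """Token count for an image whose dimensions are already resized.

    Assumption: tiles = ceil(width / 512) * ceil(height / 512).
    """
    tiles = (math.ceil(resized_width / TILE_SIZE)
             * math.ceil(resized_height / TILE_SIZE))
    return BASE_TOKENS + TOKENS_PER_TILE * tiles


def vision_price(tokens, price_per_1k=0.01):
    """Price in USD at the fixed per-1K-token rate shown above."""
    return tokens * price_per_1k / 1000


if __name__ == "__main__":
    tokens = vision_tokens(150, 150)
    print(tokens, f"${vision_price(tokens):.5f}")
```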

FAQ

What’s a token?

Which model should I use?

How will I know how many tokens I’ve used each month?

How can I manage my spending?

Is API access included in the ChatGPT Plus subscription?

Does Playground usage count against my quota?

How is pricing calculated for Completions?

How is pricing calculated for Fine-tuning?

Is there an SLA on the various models?

Is the API available on Microsoft Azure?