Starting today, all paying API customers have access to GPT-4. In March, we introduced the ChatGPT API, and earlier this month we released our first updates to the chat-based models. We envision a future where chat-based models can support any use case. Today we’re announcing a deprecation plan for older models of the Completions API, and recommend that users adopt the Chat Completions API.
GPT-4 API general availability
GPT-4 is our most capable model. Millions of developers have requested access to the GPT-4 API since March, and the range of innovative products leveraging GPT-4 is growing every day. Today all existing API developers with a history of successful payments can access the GPT-4 API with 8K context. We plan to open up access to new developers by the end of this month, and then start raising rate-limits after that depending on compute availability.
Based on the stability and readiness of these models for production-scale use, we are also making the GPT-3.5 Turbo, DALL·E and Whisper APIs generally available. We are working on safely enabling fine-tuning for GPT-4 and GPT-3.5 Turbo and expect this feature to be available later this year.
Moving from text completions to chat completions
We introduced the Chat Completions API in March, and it now accounts for 97% of our API GPT usage.
The initial Completions API was introduced in June 2020 to provide a freeform text prompt for interacting with our language models. We’ve since learned that we can often provide better results with a more structured prompt interface. The chat-based paradigm has proven to be powerful, handling the vast majority of previous use cases and new conversational needs, while providing higher flexibility and specificity. In particular, the Chat Completions API’s structured interface (e.g., system messages, function calling) and multi-turn conversation capabilities enable developers to build conversational experiences and a broad range of completion tasks. It also helps lower the risk of prompt injection attacks, since user-provided content can be structurally separated from instructions.
We plan to continue investing most of our platform efforts in this direction, as we believe it will offer an increasingly capable and easy-to-use experience for developers. We’re working on closing the last few remaining gaps of the Chat Completions API quickly, such as log probabilities for completion tokens and increased steerability to reduce the “chattiness” of responses.
Deprecation of older models in the Completions API
As part of our increased investment in the Chat Completions API and our efforts to optimize our compute capacity, in 6 months we will be retiring some of our older models using the Completions API. While this API will remain accessible, we will label it as “legacy” in our developer documentation starting today. We plan for future model and product improvements to focus on the Chat Completions API, and do not have plans to publicly release new models using the Completions API.
Starting January 4, 2024, older completion models will no longer be available, and will be replaced with the following models:
Applications using the stable model names for base GPT-3 models (
davinci) will automatically be upgraded to the new models listed above on January 4, 2024. The new models will also be accessible in the coming weeks for early testing by specifying the following model names in API calls:
Developers using other older completion models (such as
text-davinci-003) will need to manually upgrade their integration by January 4, 2024 by specifying
gpt-3.5-turbo-instruct in the “model” parameter of their API requests.
gpt-3.5-turbo-instruct is an InstructGPT-style model, trained similarly to
text-davinci-003. This new model is a drop-in replacement in the Completions API and will be available in the coming weeks for early testing.
Developers wishing to continue using their fine-tuned models beyond January 4, 2024 will need to fine-tune replacements atop the new base GPT-3 models (
davinci-002), or newer models (
gpt-4). Once this feature is available later this year, we will give priority access to GPT-3.5 Turbo and GPT-4 fine-tuning to users who previously fine-tuned older models. We acknowledge that migrating off of models that are fine-tuned on your own data is challenging. We will be providing support to users who previously fine-tuned models to make this transition as smooth as possible.
In the coming weeks, we will reach out to developers who have recently used these older models, and will provide more information once the new completion models are ready for early testing.
Deprecation of older embeddings models
Users of older embeddings models (e.g.,
text-search-davinci-doc-001) will need to migrate to
text-embedding-ada-002 by January 4, 2024. We released
text-embedding-ada-002 in December 2022, and have found it more capable and cost effective than previous models. Today
text-embedding-ada-002 accounts for 99.9% of all embedding API usage.
We recognize this is a significant change for developers using those older models. Winding down these models is not a decision we are making lightly. We will cover the financial cost of users re-embedding content with these new models. We will be in touch with impacted users over the coming days.
Deprecation of the Edits API
Users of the Edits API and its associated models (e.g.,
code-davinci-edit-001) will need to migrate to GPT-3.5 Turbo by January 4, 2024. The Edits API beta was an early exploratory API, meant to enable developers to return an edited version of the prompt based on instructions. We took the feedback from the Edits API into account when developing
gpt-3.5-turbo and the Chat Completions API, which can now be used for the same purpose: