December 6, 2024

OpenAI's Reinforcement Fine-Tuning Research Program

We’re expanding our Reinforcement Fine-Tuning Research Program to enable developers and machine learning engineers to create expert models fine-tuned to excel at specific sets of complex, domain-specific tasks.

What is Reinforcement Fine-Tuning?

This new model customization technique enables developers to customize our models using dozens to thousands of high-quality tasks, with the model’s responses graded against provided reference answers. Grading in this way reinforces how the model reasons through similar problems and improves its accuracy on specific tasks in that domain.
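
To make the idea of grading a response against a reference answer concrete, here is a minimal sketch in Python. It is illustrative only and does not use or represent the Reinforcement Fine-Tuning API: the names `Task`, `normalize`, `grade`, and `mean_score` are hypothetical, and exact-match scoring is an assumed grading rule chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One training task: a prompt plus the reference answer used for grading (hypothetical shape)."""
    prompt: str
    reference_answer: str


def normalize(text: str) -> str:
    """Lowercase the text and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def grade(response: str, reference_answer: str) -> float:
    """Assumed grading rule: 1.0 for an exact match after normalization, otherwise 0.0."""
    return 1.0 if normalize(response) == normalize(reference_answer) else 0.0


def mean_score(tasks: list[Task], responses: list[str]) -> float:
    """Average grade over a set of tasks; the kind of signal a training loop could reinforce."""
    scores = [grade(r, t.reference_answer) for t, r in zip(tasks, responses)]
    return sum(scores) / len(scores)


# Placeholder tasks and model responses, invented purely for illustration.
tasks = [
    Task(prompt="What is 12 * 12?", reference_answer="144"),
    Task(prompt="What is the capital of France?", reference_answer="Paris"),
]
responses = ["124", "  paris "]

print(mean_score(tasks, responses))
```

An all-or-nothing grader like this only suits tasks with a single agreed answer; a grader that awards partial credit would be a different design choice and is not shown here.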

Who should apply?

We encourage research institutes, universities, and enterprises to apply, particularly those that currently rely on experts to carry out narrow sets of complex tasks and would benefit from AI assistance. We’ve seen promising results in domains like law, insurance, healthcare, finance, and engineering because Reinforcement Fine-Tuning excels at tasks where the outcome has an objectively “correct” answer that most experts would agree with.

What does the program entail?

As part of the research program, you will get access to our Reinforcement Fine-Tuning API in alpha to test this technique on your domain-specific tasks. You will be asked to provide feedback to help us improve the API ahead of a public release. We’re eager to collaborate with organizations that choose to share their datasets to help improve our models.

If you are interested and think you are a fit for this program, please complete the form below to apply. We have a limited number of spots available and we will be in touch about the status of your application. We look forward to making Reinforcement Fine-Tuning publicly available in early 2025.

Please select which best describes your organization: *
Please select which best describes your domain: *
We’re interested in learning about any methods you have already tried. Which models have you used to try to solve this problem? (Select all that apply.) *
Do you have a team of developers or machine learning engineers that would build with the Reinforcement Fine-Tuning API? *
We will be prioritizing organizations who are willing to share their dataset with OpenAI to help improve our models. Would you be willing to share your dataset as part of this alpha? *