OpenAI's Reinforcement Fine-Tuning Research Program
We’re expanding our Reinforcement Fine-Tuning Research Program to enable developers and machine learning engineers to create expert models fine-tuned to excel at specific sets of complex, domain-specific tasks.
What is Reinforcement Fine-Tuning?
This new model customization technique enables developers to customize our models using dozens to thousands of high-quality tasks and to grade the model’s responses against provided reference answers. This technique reinforces how the model reasons through similar problems and improves its accuracy on specific tasks in that domain.
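To make the idea concrete, here is a minimal sketch of outcome-based grading as described above: each training task pairs a prompt with a reference answer, and a grader scores the model's response against that reference. The task format, field names, and grader logic below are illustrative assumptions, not the actual Reinforcement Fine-Tuning API.

```python
def grade_response(model_answer: str, reference_answer: str) -> float:
    """Score a model response against a reference answer.

    Returns 1.0 for a normalized exact match, else 0.0. Real graders can
    be more nuanced (partial credit, format normalization), but the key
    property is an objectively checkable reference answer.
    """
    def normalize(s: str) -> str:
        return s.strip().lower()

    return 1.0 if normalize(model_answer) == normalize(reference_answer) else 0.0


# A toy domain-specific task set; in practice this would contain
# dozens to thousands of such examples (fields are hypothetical).
tasks = [
    {"prompt": "Which ICD-10 code denotes essential hypertension?", "reference": "I10"},
    {"prompt": "What is 7% simple interest on $1,000 for one year?", "reference": "$70"},
]

model_outputs = ["i10", "$70.00"]  # pretend model responses
scores = [grade_response(out, t["reference"]) for out, t in zip(model_outputs, tasks)]
print(scores)  # the second answer fails the strict exact-match grader
```

The grading signal is what distinguishes this from supervised fine-tuning: rather than imitating reference text token by token, the model is reinforced for reasoning paths that reach the objectively correct outcome.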
Who should apply?
We encourage research institutes, universities, and enterprises to apply, particularly those whose experts currently execute narrow sets of complex tasks and would benefit from AI assistance. We’ve seen promising results in domains like Law, Insurance, Healthcare, Finance, and Engineering because Reinforcement Fine-Tuning excels at tasks where the outcome has an objectively “correct” answer that most experts would agree with.
What does the program entail?
As part of the research program, you will get access to our Reinforcement Fine-Tuning API in alpha to test this technique on your domain-specific tasks. You will be asked to provide feedback to help us improve the API ahead of a public release. We’re eager to collaborate with organizations that choose to share their datasets to help improve our models.
If you are interested and think you are a good fit for this program, please complete the form below to apply. We have a limited number of spots available, and we will be in touch about the status of your application. We look forward to making Reinforcement Fine-Tuning publicly available in early 2025.