April 16, 2025

Introducing OpenAI o3 and o4-mini


Update on June 10, 2025: OpenAI o3‑pro is now available to Pro users in ChatGPT, as well as in our API. Like OpenAI o1‑pro, o3‑pro is a version of our most intelligent model, OpenAI o3, designed to think longer and provide the most reliable responses. Full details can be found in our release notes.

Today, we’re releasing OpenAI o3 and o4-mini, the latest in our o-series of models trained to think for longer before responding. These are the smartest models we’ve released to date, representing a step change in ChatGPT's capabilities for everyone from curious users to advanced researchers. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT—this includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images. Critically, these models are trained to reason about when and how to use tools to produce detailed and thoughtful answers in the right output formats, typically in under a minute, to solve more complex problems. This allows them to tackle multi-faceted questions more effectively, a step toward a more agentic ChatGPT that can independently execute tasks on your behalf. The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks, setting a new standard in both intelligence and usefulness.

What's changed

OpenAI o3 is our most powerful reasoning model, pushing the frontier across coding, math, science, visual perception, and more. It sets a new SOTA on benchmarks including Codeforces, SWE-bench (without building a custom model-specific scaffold), and MMMU. It is ideal for complex queries that require multi-faceted analysis and whose answers may not be immediately obvious. It performs especially strongly at visual tasks such as analyzing images, charts, and graphics. In evaluations by external experts, o3 makes 20% fewer major errors than OpenAI o1 on difficult, real-world tasks, especially excelling in areas like programming, business consulting, and creative ideation. Early testers highlighted its analytical rigor as a thought partner and emphasized its ability to generate and critically evaluate novel hypotheses, particularly within biology, math, and engineering contexts.

OpenAI o4-mini is a smaller model optimized for fast, cost-efficient reasoning: it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks. It is the best-performing benchmarked model on AIME 2024 and 2025. Although access to a computer meaningfully reduces the difficulty of the AIME exam, we also found it notable that o4-mini achieves 99.5% pass@1 (100% consensus@8) on AIME 2025 when given access to a Python interpreter. While these results should not be compared to the performance of models without tool access, they are one example of how effectively o4-mini leverages available tools; o3 shows similar improvements on AIME 2025 from tool use (98.4% pass@1, 100% consensus@8).
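To make the two metrics concrete, here is a minimal sketch of how pass@1 and consensus@k are commonly computed from k sampled answers per problem. The definitions used (pass@1 as the average per-sample accuracy, consensus@k as the accuracy of the majority-voted answer) are the usual conventions and an assumption about the exact evaluation procedure; the function names and sample data are purely illustrative and are not real AIME results.

```python
from collections import Counter

def pass_at_1(samples, truth):
    """Average fraction of individual samples that match the reference answer."""
    per_problem = [
        sum(ans == truth[pid] for ans in answers) / len(answers)
        for pid, answers in samples.items()
    ]
    return sum(per_problem) / len(per_problem)

def consensus_at_k(samples, truth):
    """Fraction of problems whose most common sampled answer is correct."""
    hits = 0
    for pid, answers in samples.items():
        majority, _ = Counter(answers).most_common(1)[0]
        hits += majority == truth[pid]
    return hits / len(samples)

# Illustrative data: 2 problems, 8 samples each (hypothetical values).
truth = {"p1": 204, "p2": 113}
samples = {
    "p1": [204] * 8,
    "p2": [113] * 7 + [311],
}
print(pass_at_1(samples, truth))       # 0.9375
print(consensus_at_k(samples, truth))  # 1.0
```

In this toy case a single wrong sample lowers pass@1 but leaves consensus@8 untouched, which is why the two numbers can differ in the way reported above.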

In expert evaluations, o4-mini also outperforms its predecessor, o3‑mini, on non-STEM tasks as well as in domains like data science. Thanks to its efficiency, o4-mini supports significantly higher usage limits than o3, making it a strong high-volume, high-throughput option for questions that benefit from reasoning. External expert evaluators rated both models as demonstrating improved instruction following and more useful, verifiable responses than their predecessors, thanks to improved intelligence and the inclusion of web sources. Compared to previous iterations of our reasoning models, these two models should also feel more natural and conversational, especially as they reference memory and past conversations to make responses more personalized and relevant.

Multimodal

Coding

All SWE-bench evaluation runs use a fixed subset of n=477 verified tasks that have been validated on our internal infrastructure.

Instruction following and agentic tool use

All models are evaluated at high "reasoning effort" settings, similar to variants like "o4-mini-high" in ChatGPT.

Continuing to scale reinforcement learning

Throughout the development of OpenAI o3, we’ve observed that large-scale reinforcement learning exhibits the same “more compute = better performance” trend observed in GPT‑series pretraining. By retracing the scaling path—this time in RL—we’ve pushed an additional order of magnitude in both training compute and inference-time reasoning, yet still see clear performance gains, validating that the models’ performance continues to improve the more they’re allowed to think. At equal latency and cost with OpenAI o1, o3 delivers higher performance in ChatGPT—and we've validated that if we let it think longer, its performance keeps climbing.

We also trained both models to use tools through reinforcement learning—teaching them not just how to use tools, but to reason about when to use them. Their ability to deploy tools based on desired outcomes makes them more capable in open-ended situations—particularly those involving visual reasoning and multi-step workflows. This improvement is reflected both in academic benchmarks and real-world tasks, as reported by early testers.

Thinking with images

For the first time, these models can integrate images directly into their chain of thought. They don’t just see an image—they think with it. This unlocks a new class of problem-solving that blends visual and textual reasoning, reflected in their state-of-the-art performance across multimodal benchmarks.

People can upload a photo of a whiteboard, a textbook diagram, or a hand-drawn sketch, and the model can interpret it—even if the image is blurry, reversed, or low quality. With tool use, the models can manipulate images on the fly—rotating, zooming, or transforming them as part of their reasoning process.

These models deliver best-in-class accuracy on visual perception tasks, enabling them to solve questions that were previously out of reach. Check out the visual reasoning research blog to learn more.

Toward agentic tool use

OpenAI o3 and o4-mini have full access to tools within ChatGPT, as well as to your own custom tools via function calling in the API. These models are trained to reason about how to solve problems, choosing when and how to use tools to produce detailed and thoughtful answers in the right output formats quickly, typically in under a minute.

For example, a user might ask: "How will summer energy usage in California compare to last year?" The model can search the web for public utility data, write Python code to build a forecast, generate a graph or image, and explain the key factors behind the prediction, chaining together multiple tool calls. Reasoning allows the models to react and pivot as needed based on the information they encounter. For example, they can search the web multiple times with the help of search providers, look at the results, and try new searches if they need more information.

This flexible, strategic approach allows the models to tackle tasks that require access to up-to-date information beyond the model's built-in knowledge, extended reasoning, synthesis, and output generation across modalities.
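The pattern described above, where the model decides which tool to call and the application executes it, can be sketched as a small local dispatch loop. This is an illustrative sketch only: the tool names (`web_search`, `run_python`), the shape of the `tool_calls` list, and the stubbed results are hypothetical and do not reproduce the OpenAI API. In a real integration the tool calls would come from the model via function calling.

```python
import json

# Hypothetical local tools; real ones would call a search API or a sandbox.
def web_search(query: str) -> str:
    return f"[stub] top results for: {query}"

def run_python(code: str) -> str:
    # Stub: a real tool would execute the code in an isolated sandbox.
    return f"[stub] executed {len(code)} characters of code"

TOOLS = {"web_search": web_search, "run_python": run_python}

def dispatch(tool_call: dict) -> str:
    """Look up the requested tool and call it with JSON-decoded arguments."""
    name = tool_call["name"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    args = json.loads(tool_call["arguments"])
    return TOOLS[name](**args)

# A made-up sequence of calls a model might request for the energy question.
tool_calls = [
    {"name": "web_search",
     "arguments": json.dumps({"query": "California summer energy usage"})},
    {"name": "run_python",
     "arguments": json.dumps({"code": "print('forecast')"})},
]
results = [dispatch(call) for call in tool_calls]
for line in results:
    print(line)
```

Keeping the tools in a plain dictionary means an unknown tool name produces an error string that can be fed back to the model instead of raising an exception.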

All examples were completed with OpenAI o3.

OpenAI o3

Solve the following very hard math problem using your in-built knowledge of the literature:

Construct a degree 19 polynomial \(p(x) \in \mathbb{C}[x]\) such that \(X := \{\, p(x) = p(y) \,\} \subset \mathbb{P}^{1} \times \mathbb{P}^{1}\) has at least 3 (but not all linear) irreducible components over \(\mathbb{C}\). Choose \(p(x)\) to be odd, monic, have real coefficients and linear coefficient \(-19\) and calculate \(p(19)\).

Reasoned about polynomial construction for 55 seconds

The curve defined by p(x) = p(y) factors if p(x) is decomposable or has repeated factors. Actually, p(x)-p(y) always factors, and it’s divisible by (x-y). This gives one factor, and the rest is a degree d-1 polynomial in x and y. For the curve to have at least three irreducible components, p(x)-p(y) must factor into at least three nontrivial factors. Each factor corresponds to a component in projective space, and at least one must be nonlinear.
I’m checking the Dickson polynomials D_n(x,a) and focusing on their coefficients. With a=1, it seems the leading coefficient is always 1, confirming that Dickson polynomials are monic. The linear coefficient for odd n is following a predictable sign pattern: for n=19, it gives a coefficient of -19, which is exactly what I need. So the Dickson polynomial D_n(x,1) with odd n fits nicely with the linear coefficient I was looking for.
I've confirmed that the linear coefficient for $p(x) = D_{19}(x)$ is correct, with the necessary scaling factor to match. Now for the next step, I need to compute $D_{19}(19)$. Since $T_n(z)$ grows quickly for real $z > 1$, I plan to compute this value using Python, leveraging the recurrence to avoid mistakes—it's going to be a big number. I'll make sure to present the numeric result neatly so it's understandable since the coefficients are integers.
The Dickson polynomial D_{19}(x,1) is confirmed to have the required properties: monic, odd, real coefficients, and a linear coefficient of -19. The factorization of D_{19}(x) into irreducible components is also verified and satisfies the condition of having at least 10 components (1 linear + 9 quadratics). To compute p(19), we'll use the recurrence relation for Dickson polynomials, but there's also a closed-form expression involving cosine that could be handy here. I'll proceed accordingly!
I confirmed the linear coefficient is working correctly, which means now I need to calculate $p(19) = D_{19}(19)$. This involves evaluating $T_{19}(9.5)$, which grows rapidly. To ensure I get an accurate result, I'll compute this using recurrence in Python. The result is a large number: 1,876,572,071,974,094,803,391,179. I'll use the commas to make it easier to read: $p(19) = 1{,}876{,}572{,}071{,}974{,}094{,}803{,}391{,}179$.
For a = 1, there’s a relationship between Dickson polynomials and Chebyshev polynomials: D_n(x, 1) = 2^(n-1) T_n(x/2). Checking this, we see that the definition of D_n(u + 1/u) aligns well with Chebyshev’s formula. By setting x = 2 cos(θ), we find that D_n(x) equals 2 T_n(x/2). Importantly, the leading coefficient works out to 1, confirming the identity holds. This is nice!
Now, I need to verify that $p(x) - p(y)$ factors as desired, relying on the known Dickson polynomial factorization. According to the literature (like Fried and Jarden), the difference $D_n(x) - D_n(y)$ factors into a product involving $(x - y)$ and other expressions. I'll provide the explicit polynomial $D_{19}(x, 1)$, list the coefficients, and explain how it satisfies the desired properties: monic, odd, etc. Lastly, I'll compute $p(19)$.

Analyzed

Python

from math import comb, cos, acos
# Let's compute Dickson D_n(x,a=1) recursively
def dickson(n,x,a=1):
    if n==0:
        return 2
    if n==1:
        return x
    Dm2=2
    Dm1=x
    for k in range(2,n+1):
        D= x*Dm1 - a*Dm2
        Dm2, Dm1 = Dm1, D
    return Dm1

d19 = dickson(19,19,1)
d19

1876572071974094803391179

Analyzed

Python

d19

1876572071974094803391179

Analyzed

Python

import locale, sys, math, json, textwrap, re, decimal

Analyzed

Python

"{:,}".format(d19)

'1,876,572,071,974,094,803,391,179'

Analyzed

Python

# compute coefficients of Dickson polynomial D_n(x,1) via recurrence polynomial representation
from sympy import symbols, Poly

x = symbols('x')
def dickson_poly(n):

Poly(x**19 - 19*x**17 + 152*x**15 - 665*x**13 + 1729*x**11 - 2717*x**9 + 2508*x**7 - 1254*x**5 + 285*x**3 - 19*x, x,

Analyzed

Python

poly19.as_expr()

x**19 - 19*x**17 + 152*x**15 - 665*x**13 + 1729*x**11 - 2717*x**9 + 2508*x**7 - 1254*x**5 + 285*x**3 - 19*x

Done

A convenient way to force many factors in
$p(x) \;-\; p(y)$ is to use a Dickson polynomial of the first kind.
For a parameter $a \in \mathbb{C}$ these are defined by

$D_0\!\left(x,\, a\right) = 2,\; D_1\!\left(x,\, a\right) = x,\; D_n\!\left(x,\, a\right) = x\,D_{n-1}\!\left(x,\, a\right) - a\,D_{n-2}\!\left(x,\, a\right)\quad (n \ge 2),$

and satisfy the functional identity

$D_n\!\left(u + \frac{a}{u},\, a\right) = u^{n} + \frac{a^{n}}{u^{n}}.$

1. Choosing the polynomial

Take $a = 1$ and $n=19$.
Because $n$ is odd, $D_{19}\!\left(x,\, 1\right)$ is an odd polynomial; the recurrence immediately shows it is monic and has integer (hence real) coefficients.
The coefficient of $x$ is $(-1)^{\frac{n-1}{2}}\,n = -19,$ so all requested normalisations are satisfied.

Writing it out gives

$\boxed{p(x) = D_{19}\!\left(x,1\right) = x^{19} - 19x^{17} + 152x^{15} - 665x^{13} + 1729x^{11} - 2717x^{9} + 2508x^{7} - 1254x^{5} + 285x^{3} - 19x}\,.$

2. Irreducible components of $X \;=\; \bigl\{\, p(x) = p(y) \,\bigr\}$

For Dickson polynomials one has the classical factorisation

$D_n\!\left(x,1\right) \;-\; D_n\!\left(y,1\right) \;=\; (x - y)\, \prod_{j=1}^{\frac{n-1}{2}} \Bigl( x^{2} + y^{2} - 2\cos\!\bigl(\tfrac{2\pi j}{n}\bigr)\,xy - 4\sin^{2}\!\bigl(\tfrac{2\pi j}{n}\bigr) \Bigr),$

where each quadratic factor is irreducible over $\mathbb{C}$ (see, e.g., Lidl–Niederreiter, Finite Fields, §3.6).
For $n=19$ this gives

one linear component $x=y$;
nine quadratic components (corresponding to $j = 1,\,\ldots,\,9$).

Hence $X \subset \mathbb{P}^{1} \times \mathbb{P}^{1}$ has $10$ irreducible components, at least three of them and not all linear, as required.
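As an independent sanity check (not part of the model's output), the functional identity $D_n(u + 1/u,\, 1) = u^{n} + u^{-n}$ explains where the components come from: if $\zeta^{19} = 1$, then $x = u + 1/u$ and $y = \zeta u + 1/(\zeta u)$ satisfy $p(x) = p(y)$, and $\zeta$ and $\zeta^{-1}$ trace the same curve, which gives $1 + 9 = 10$ curves. The sketch below checks the relation numerically at one arbitrarily chosen point `u`; the test point and the tolerance are assumptions.

```python
import cmath

def dickson(n, x, a=1):
    """Evaluate D_n(x, a) with the three-term recurrence."""
    if n == 0:
        return 2
    prev, cur = 2, x
    for _ in range(2, n + 1):
        prev, cur = cur, x * cur - a * prev
    return cur

n = 19
u = 1.1 * cmath.exp(0.7j)  # arbitrary test point (assumption)
x = u + 1 / u
max_rel_err = 0.0
for j in range(n):
    zeta = cmath.exp(2j * cmath.pi * j / n)  # a 19th root of unity
    y = zeta * u + 1 / (zeta * u)
    px, py = dickson(n, x), dickson(n, y)
    max_rel_err = max(max_rel_err, abs(px - py) / abs(px))

# Expected to be at floating-point noise level for every root of unity.
print(max_rel_err)
```

This does not prove irreducibility of each component; it only confirms that every choice of $\zeta$ yields points on $X$.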

3. Evaluating $p(19)$

Using the recurrence (or the identity $D_n\!\left(x,\,1\right) \;=\; 2\,T_n\!\bigl(\tfrac{x}{2}\bigr)$ with Chebyshev $T_n$) one obtains

$p(19) \;=\; D_{19}\!\left(19,\,1\right) \;=\; 1\,876\,572\,071\,974\,094\,803\,391\,179.$

Thus the polynomial above fulfils every condition, and

$\boxed{p(19) \;=\; 1{,}876{,}572{,}071{,}974{,}094{,}803{,}391{,}179}.$
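As an independent cross-check (again, not part of the model's output), the coefficients can also be generated from the standard closed form $D_n(x,1) = \sum_{i=0}^{\lfloor n/2 \rfloor} \frac{n}{n-i}\binom{n-i}{i}(-1)^{i}\,x^{n-2i}$ and evaluated with exact integer arithmetic. The helper names below are illustrative.

```python
from math import comb

def dickson_coeffs(n):
    """Map exponent -> integer coefficient of D_n(x, 1), via the closed form."""
    return {
        n - 2 * i: (-1) ** i * n * comb(n - i, i) // (n - i)
        for i in range(n // 2 + 1)
    }

def evaluate(coeffs, x):
    """Exact evaluation of a sparse integer polynomial."""
    return sum(c * x ** e for e, c in coeffs.items())

coeffs = dickson_coeffs(19)
print(coeffs[19], coeffs[1])            # 1 -19  (monic, linear coefficient -19)
print(all(e % 2 == 1 for e in coeffs))  # True   (only odd powers, so p is odd)
print(f"{evaluate(coeffs, 19):,}")      # 1,876,572,071,974,094,803,391,179
```

Because Python integers have arbitrary precision, the evaluation is exact and agrees with the value obtained from the recurrence.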

OpenAI o1

Solve the following very hard math problem using your in-built knowledge of the literature:

Reasoned for 1m 19s

I’m piecing together a degree 19 polynomial with only odd powers and a linear coefficient of -19, like: $p(x) = x^{19} + a_{17}x^{17} + \ldots + a_1x$.

I'm examining if the polynomial $p(x) = x^{19} + a_{17}x^{17} + \ldots + a_3x^3 - 19x$ has a notable classical form, considering symmetry or factorization for multiple irreducible components.

To ensure $p(x)$ has three or more irreducible components, I’m considering the factorization of $p(x) - p(y)$ into at least three irreducible factors in $\mathbb{C}[x,y]$.

I'm working through $x^{19} - y^{19}$ to ensure it balances linear and nonlinear factors, aiming for a polynomial with at least three irreducible components that aren't all linear.

Taking a closer look at how $p(x) - p(y)$ can be factored into three distinct polynomials in $x$, aiming to achieve the desired irreducible components.

I'm thinking about how composing $p$ with simpler polynomials, like $x^m - c$, could result in multiple irreducible components, achieving at least three distinct factors.

I’m exploring the factorization of $p(x) - p(y)$ with cyclotomic polynomials. This method generates $n$ linear irreducible factors, which might not achieve the required three distinct components.

I'm focusing on constructing an "odd" and monic polynomial, ensuring all exponents are odd and the leading term is $x^{19}$. This approach aims to achieve the desired factorization.

I'm working with the Chebyshev polynomials to ensure $p(x) - p(y)$ has at least three irreducible components, leveraging their even or odd properties to manage coefficients.

Considering the odd Chebyshev polynomial $T_{19}(x)$ for potential factorization, while also examining composite polynomials of lower degrees to enhance factorization properties.

I'm analyzing how a polynomial composed of lower-degree polynomials, like $(x^3 - 3x)^k$, could yield $p(x) - p(y)$ with at least three irreducible components.

Piecing together polynomials with integer coefficients, like $(x^3 - 3x)^k$, to achieve factorization in $p(x) - p(y)$ with at least three irreducible components, aiming for degree 19.

I’m examining how $p(x) = (x^3 - 3x)^k$ or $p(x) = x^{19} - 19x$ might yield $p(x) - p(y)$ with three irreducible components, focusing on ensuring the polynomial is of degree 19.

I’m analyzing $p(x) - p(y)$ into $(x-y)(x^{18} + x^{17}y + \ldots + y^{18} - 19)$ to identify irreducible components. The polynomial factor $x^{18} + x^{17}y + \ldots + y^{18} - 19$ remains challenging to factor further.

I’m thinking through a new approach: adjusting the Chebyshev polynomial $T_{19}(x)$ to make it monic, aiming to achieve at least three irreducible components in $p(x) - p(y)$.

OK, let's think through. I’m tweaking $T_{19}(x)$'s coefficients to keep the polynomial monic and check if it simplifies into multiple irreducible components.

I'm thinking over various factorization strategies for a monic polynomial of degree 19. The goal is to ensure at least one non-linear factor and achieve at least three irreducible components in $p(x) - p(y)$.

I’m updating my approach to factor $x^2 + xy + y^2 - 1$ over complex numbers by exploring change of variables and third roots of unity. This helps in achieving the desired polynomial structure.

I'm combining $x^3 - x$ with $x^{19} - 19x$ to tweak the polynomial's coefficients and complexity, aiming to include more non-linear factors for the desired irreducibility.

Piecing together $p(x) = x^{19} - x^3 - 19x$ to ensure it’s odd, monic, and has the required linear coefficient. This targets the desired irreducible factorizations.

I’m mapping out the second factor $x^{18} + \ldots + y^{18} - x^2 - xy - y^2 - 19$ to see if it can split into three irreducible components.

I’m considering methods to factorize the polynomial by making it divisible by $x - y$ and ensuring the remaining factor splits into exactly two irreducible components, aiming for a less complex factorization.

I’m examining the polynomial $x^{18} + x^{17}y + \ldots + y^{18}$ and noting it simplifies to $\frac{x^{19} - y^{19}}{x - y}$, which might reveal further irreducible factors.

I’m mapping out the polynomial $\frac{x(x^{18}-1) - y(y^{18}-1)}{x-y}$ for further factorization. This might yield just two irreducible components, which falls short of three.

I’m exploring whether the polynomial $p(x) = x^{19} - (x^3)$ can be expressed as the composition of a smaller polynomial with $x^k$, potentially revealing multiple irreducible components.

I'm examining if the polynomial $T_{19}(x)$ can be factored using trigonometric identities, seeking partial factorization without fully linearizing, to gain multiple irreducible components.

OK, let me see. Trigonometric identities show multiple solutions for $\theta$ and $\phi$, likely leading to 19 or 38 linear factors for $T_{19}(x)$.

I’m examining how the equation $\arccos x = \pm \arccos y + \frac{2k\pi}{19}$ leads to possible multiple branches, potentially irreducible in the algebraic sense, and needing an example with at least three non-linear irreducible components.

OK, let me see. I’m pulling together details on the polynomial $x^3 (x^2 - 1)^8 + x$. It’s monic, with $x^{19}$ as the leading term, ensuring it meets key criteria.

I’m noting that the polynomial $p(x) = x^3 (x^2 - 1)^8 + x$ is odd, with its linear coefficient patterning to -19.

I’m figuring out how to adjust the linear coefficient of polynomial $p(x)$ to -19 by adding a constant factor $c$ in front of $x$ to ensure it’s adequately modified.

By tweaking the linear coefficient of $p(x) = x^3 (x^2 - 1)^8 + c x$ to -19, it ensures the polynomial remains monic, odd, and real.

I’m working on adding a constant factor $c$ to $x$ in $p(x) = x^3 (x^2 - 1)^8 + c x$ to ensure the linear coefficient is -19, making the polynomial monic, odd, and real.

I'm piecing together the polynomial $p(x) = x^3 (x-1)^8 (x+1)^8$ and considering how $p(x) - p(y)$ might factor with factors like $x-y$.

I’m confirming $p(x)$ is monic, odd, and real, with a linear coefficient of -19, by expanding $x^3 (x^2 -1)^8$ to set the constant $c$.

I found that for p(x) to have a coefficient of -19 for the x term, c must be -19. The polynomial, p(x), incorporates this constant and preserves the original polynomial's structure.

Below is one concrete example (over the reals) of an odd, monic, degree 19 polynomial with linear coefficient $-19$ whose zero-locus relation $\{\,p(x)=p(y)\,\}\ \text{in }\mathbb{P}^{1}\times\mathbb{P}^{1}$ splits into at least three irreducible components (and not all components are linear). We then compute $p(19)$ .

1. The polynomial

Consider

$p(x) \;=\; x^{3}\,\bigl(x^{2}-1\bigr)^{8} \;-\; 19x.$

We can expand

$x^{3}\,\bigl(x^{2}-1\bigr)^{8} \;=\; x^{19}\;-\;8x^{17}\;+\;28x^{15}\;-\;56x^{13}\;+\;70x^{11} \;-\;56x^{9}\;+\;28x^{7}\;-\;8x^{5}\;+\;x^{3}.$

Hence

$p(x) \;=\; x^{19} \;-\; 8x^{17} \;+\; 28x^{15} \;-\; 56x^{13} \;+\; 70x^{11} \;-\; 56x^{9} \;+\; 28x^{7} \;-\; 8x^{5} \;+\; x^{3} \;-\; 19x.$

Checking the required properties

Degree 19, monic:
The leading term is $x^{19}$ with coefficient $1$ .

Odd polynomial:
All exponents are odd $\bigl(19,\,17,\,\ldots,\,3,\,1\bigr),$ and the constant term is $0$ . Indeed,

$p(-x) \;=\; -\,p(x),$

which makes $p$ an odd polynomial.

Real coefficients:
All coefficients are real integers.

Linear coefficient $-19$ :
The coefficient of $x$ is clearly $-19$ .

At least three irreducible components, not all linear:
Over $\mathbb{C}$ , the locus

$X \;=\; \bigl\{\, (x,y) : p(x)=p(y) \,\bigr\} \;\subset\; \mathbb{P}^{1}\times\mathbb{P}^{1}$

factors in part because $p(x)$ has (for instance) the repeated factors $\bigl(x^{2}-1\bigr)^{8}$ in the big piece $x^{3}\,\bigl(x^{2}-1\bigr)^{8}$. One sees that $\{\,p(x)=p(y)\,\}$ contains (at least) the divisor $\{x=y\}$ plus further factors coming from the roots $x = 0 \;\text{or}\; x = \pm 1,$ ensuring multiple irreducible components (and not all of them are hyperplanes).

2. Computing $p(19)$

A direct substitution $x=19$ gives a large integer. Symbolic expansion or a computer algebra system yields

$p(19) \;=\; 1934999285524070399999639.$

Hence the required value is

$\boxed{\,p(19) \;=\; 1934999285524070399999639\,}.$

OpenAI o3 gets the correct answer without using search, whereas o1 fails to deliver a correct response.

Advancing cost-efficient reasoning

Cost vs. performance: o3‑mini and o4-mini

AIME evals for OpenAI o3-mini and o4-mini

GPQA evals for OpenAI o3-mini and o4-mini

Cost vs. performance: o1 and o3

OpenAI o3 and o4-mini are the most intelligent models we have ever released, and they are also often more efficient than their predecessors, OpenAI o1 and o3‑mini. For example, on the 2025 AIME math competition, the cost-performance frontier for o3 strictly improves over o1, and similarly, the frontier for o4-mini strictly improves over o3‑mini. More generally, we expect that for most real-world usage, o3 and o4-mini will also be both smarter and cheaper than o1 and o3‑mini, respectively.

Safety

Each improvement in model capabilities warrants commensurate improvements to safety. For OpenAI o3 and o4-mini, we completely rebuilt our safety training data, adding new refusal prompts in areas such as biological threats (biorisk), malware generation, and jailbreaks. This refreshed data has led o3 and o4-mini to achieve strong performance on our internal refusal benchmarks (e.g., instruction hierarchy⁠, jailbreaks). In addition to strong performance for model refusals, we have also developed system-level mitigations to flag dangerous prompts in frontier risk areas. Similar to our earlier work in image generation⁠, we trained a reasoning LLM monitor which works from human-written and interpretable safety specifications. When applied to biorisk, this monitor successfully flagged ~99% of conversations in our human red‑teaming campaign.

We stress tested both models with our most rigorous safety program to date. In accordance with our updated Preparedness Framework⁠, we evaluated o3 and o4-mini across the three tracked capability areas covered by the Framework: biological and chemical, cybersecurity, and AI self-improvement. Based on the results of these evaluations, we have determined that both o3 and o4‑mini remain below the Framework's "High" threshold in all three categories. We have published the detailed results from these evaluations in the accompanying system card⁠.⁠

Codex CLI: frontier reasoning in the terminal

We’re also sharing a new experiment: Codex CLI, a lightweight coding agent you can run from your terminal. It works directly on your computer and is designed to maximize the reasoning capabilities of models like o3 and o4-mini, with upcoming support for additional API models like GPT‑4.1⁠.

You can get the benefits of multimodal reasoning from the command line by passing screenshots or low fidelity sketches to the model, combined with access to your code locally. We think of it as a minimal interface to connect our models to users and their computers. Codex CLI is fully open-source at github.com/openai/codex today.

Alongside, we are launching a $1 million initiative to support projects using Codex CLI and OpenAI models. We will evaluate and accept applications for grants in increments of $25,000 USD in the form of API credits. Proposals can be submitted here.

Access

ChatGPT Plus, Pro, and Team users will see o3, o4-mini, and o4-mini-high in the model selector starting today, replacing o1, o3‑mini, and o3‑mini‑high. ChatGPT Enterprise and Edu users will gain access in one week. Free users can try o4-mini by selecting "Think" in the composer before submitting their query. Rate limits across all plans remain unchanged from the prior set of models.

We expect to release OpenAI o3‑pro in a few weeks with full tool support. For now, Pro users can still access o1‑pro.

Both o3 and o4-mini are also available to developers today via the Chat Completions API and the Responses API (some developers will need to verify their organizations to access these models). The Responses API supports reasoning summaries and the ability to preserve reasoning tokens around function calls for better performance, and it will soon support built-in tools such as web search, file search, and code interpreter within the model's reasoning. Stay tuned for more updates. In the meantime, you can explore our docs.

What's next

Today's updates reflect the direction our models are heading in: we’re converging the specialized reasoning capabilities of the o-series with more of the natural conversational abilities and tool use of the GPT‑series. By unifying these strengths, our future models will support seamless, natural conversations alongside proactive tool use and advanced problem-solving.

Update on July 28, 2025: SWE-Lancer dataset and results updated as of July 17, 2025, available at: https://github.com/openai/preparedness and in our system cards. This update resolves several issues that were impacting the dollars earned results and removes the requirement for internet connectivity during execution, eliminating a primary source of variability in model performance.

Update on April 16, 2025: results for o3 on Charxiv-r and Mathvista were updated to reflect a system prompt change that wasn't present in the original evaluation.

Livestream replay

Author

OpenAI

Footnotes

* tau-bench eval numbers are averaged across 5 runs to reduce variance and are run without any custom tools or prompting. We found that tau-bench retail rollouts are more prone to user-model errors. Numbers for the shaded bar are run with GPT-4.1 as the user model because it is significantly better than GPT-4o at instruction following.

* SWE-bench uses a 256k max context length, which improves the solve rate for o4-mini by about 3% and affects the solve rate for o3 by less than 1%. We also exclude 23 samples that are not runnable on our internal infrastructure.

** When browsing is enabled, the model can sometimes find exact answers online, for example by reading blog posts with sample problems from a dataset. We mitigate concerns about the model cheating while browsing with two strategies:

Blocking domains from which we have observed the model cheating in the past.
Using a reasoning model as a monitor to examine all tokens in each attempt and identify suspicious behavior. Suspicious behavior is defined as "a page, file, or snippet whose main purpose is to provide the exact answer to this specific question: for example, an official grading key, a leaked 'solutions' gist, or a discussion quoting the finished answer verbatim." Benign behavior is defined as "any authoritative resource a diligent human might consult (documentation, manuals, scholarly papers, reputable articles), even if it incidentally contains the correct answer." Any attempts that the monitor deemed suspicious are counted as incorrect. Most samples failing this check were problems whose exact solution was available on multiple internet sources unrelated to HLE.

Finally, our evaluations with browsing may not be perfectly reproducible in the OpenAI API because of differences in search engine backends between ChatGPT and the OpenAI API. These results are intended to be representative of the ChatGPT user experience, but depending on demand, the search configuration may change over time.

Contributors

Аадитья Сингх, Аарон Шлезингер, Адам Фрай, Адам Лерер, Адам Перельман, Адам Уокер, Ахмед Эль-Кишки, Айдан Кларк, Айдан Маклафлин, Айден Лоу, Акила Велихинда, Акшай Натан, Александр Мадри, Александра Спайра, Алекс Карпенко, Алекс Нейтц, Алекс Тачард Пассос, Алекс Вэй, Александр Прокофьев, Александр Зелински, Александра Барр, Алексей Иванов, Алексей Кристакис, Альфред Сюэ, Эллисон Там, Элли Беннетт, Амелия Лю, Эми Макдональд Санджиде, Ананья Кумар, Андре Сарайва, Андреа Валлоне, Эндрю Чен, Эндрю Дуберстайн, Эндрю Гибиански, Эндрю Кондрих, Эндрю Таллох, Андрей Мищенко, Энди Эпплбаум, Энди Ванг, Анджела Бэк, Энни Вэй, Антинг Шен, Антуан Пелисс, Анудж Сахаран, Арун Виджайвергия, Эшли Тайра, Ашвин Наир, Ави Наяк, Авиталь Оливер, Бехруз Горбани, Белинда Труонг, Бен Соколовски, Бет Хувер, Бо Сю, Боаз Барак, Бохан Чжан, Борис Минаев, Ботао Хао, Боуэн Бейкер, Боуэн Ченг, Брэндон Маккинзи, Брэндон Ванг, Брайан Хсу, Брайан Янг, Брайан Ю, Брайан Чжан, Камилло Лугарези, Каролина Паз, Карпус Чанг, Кэри Бассин, Кэри Хадсон, Кейси Чу, Чак Ли, Чарльз Чжао, Чарли Джатт, Шарлотта Коул, Челси Восс, Чен Шэнь, Чэнсюй Чжуан, Крис Колби, Крис Холласи, Крис Кох, Кристина Каплан, Кристина Ким, Колин Рид, Колин Вэй, Кристина Шо, Д. Скалли, Дэмиен Девиль, Дэн Робертс, Дана Палми, Дейн Стаки, Дэниел Левин, Дэвид Ху, Дэвид Мартин, Дэвид Робинсон, Дэвид Сасаки, Дэвис Ву, Дерек Чен, Дибья Бхаттачарджи, Димитрис Ципрас, Дингхуа Ли, Д.Ж. Штроус,
dmed Medina, Дрю Хинц, Эдди Чжан, Эдмунд Вонг, Элейн Я Ле, Эли Яни, Элизабет Проэл, Эмили Соколова, Энок Чеунг, Эри Шварц, Эрик Митчелл, Эрик Нинг, Эрик Сиглер, Эрик Уоллес, Эудженио Панеро, Эван Мэйс, Евгений Никишин, Фан Ван, Фаньюань Ли, Филиппо Расо, Фойвос Цимпурлас, Фуад Матин, Фрэнсис Сонг, Фрэнсис Чжан, Гэри Янг, Джин Оден, Джамбаттиста Параскандоло, Гильдас Шабо, Грейс Ким, Грейс Чжао, Грег Брокман, Грегори Валиант, Гийом Леклерк, Хади Салман, Хайтанг Ху, Ханна Шихан, Хао Шэн, Хаою Ван, Энрике Понде де Оливейра Пинто, Генри Аспегрен, Хэцин Ян, Хессам Багеринежад, Хунъю Жэнь, Хантер Лайтман, Хёнву Но, Иэн Кивличан, Иэн Сол, Игнаси Клавера, Икаи Лан, Илге Аккая, Илья Костриков, Ирина Кофман, Иса Фулфорд, Джейк Брилл, Якуб Пачоцки, Джеймс Беткер, Джеймс Ли, Джеймс Цин, Джейми Кирос, Джейсон Ай, Джей Ванг, Джин Харб, Джефф Микки, Джеффри Хан, Джеффри Ванг, Джереми Чен, Джерри Турек, Джессика Лян, Джессика Шие, Цзи Линь, Цзяхуэй Ю, Цзяньфэн Ван, Цзе Тан, Цзихань Инь, Цзин Ли, Джоан Чан, Джоэл Моррис, Йоханнес Ферстад, Йоханнес Хайдеке, Джон Фишбейн, Джон Окун, Джонатан Гордон, Йоост Хёйзинга, Джос Крайжевелд, Джозеф Мо, Джош Лоусон, Джош Тобин, Цзюньхуа Мао, Кай Чен, Кай Хаяши, Каран Сингхал, Карина Нгуен, Кэти Ши, Келли Стирман, Кенджи Хата, Кенни Нгуен, Керен Гу-Лемберг, Кевин Гладстон, Кевин Кинг, Кевин Лю, Кевин Лу, Кевин Парк, Кевин Стоун, Кевин Вейл, Кевин Уиннери, Кевин Ю, Коте Мушегиани, Кристен Инг, Кристиан Георгиев, Кшитий Гупта, Кайл Косич, Лама Ахмад, Ларри Лв, Лорен Итоу, Лорен Янг, Ли Байрон, Лео Чен, Лео Лю, Леон Максин, Лейтон Хо, Ли Цзин, Лян Сюн, Линь Ян, Линден Ли, Лоренц Кун, Луи Фоврье, Лу Чжан, Лукаш Кайзер, Махмуд Эриби, Майя Трембач, Манас Джоглекар, Маноли Лиодакис, Манука Стратта, Марк Чен, Марк Хадналл, Марк Сан, Марк Ванг, Мартин Ли, Марвин Чжан, Матеуш Литвин, Мэтт Джонс, Мэтт Лим, Макс Джонсон, Макс Шварцер, Майанк Гупта, Меган Шах, Мэнцин Ван, Мэнъюань Ян, Миа Глейзе, Майкл Болин, Майкл Лампе, Майкл Малек, Майкл Шарман,
Майкл Чжан, Мишель Ванг, Мишель Покрасс, Мигель Оом Темудо де Кастро, Михай Флориан, Майк МакКлэй, Майк Трпчич, Мики Хабрин, Майлз Ванг, Минг Чен, Минсюань Ванг, Минния Фэн, Митчелл Гордон, Мо Бавариан, Мостафа Роханинежад, Начо Сото, Накул Ханна, Нат МакАлис, Натали Стаудахер, Натан ЛаФонтейн, Нил Аджарапу, Ник Фелт, Ник Терли, Никил Панча, Никита Михайлин, Нико Феликс, Никундж Ханда, Нинг Лю, Нишант Рай, Ноа Йоргенсен, Ноам Браун, Олег Бойко, Олег Мурк, Оливия Уоткинс, Оливье Годеман, Уна Глисон, Пол Эшборн, Павел Белов, Питер Флокхарт, Питер Хёшеле, Питер Жохов, Филип Пронин, Филлип Го, Фиби Тэкер, Прафулла Дхаривал, Прашант Р, Рэйчел Диас, Рахул Арора, Раджкумар Самуэль, Расмус Ригаард, Рави Теджа Муллапуди, Рэймонд Ли, Раз Гаон, Реа Мияра, Рейичиро Накано, Реймар Лейке, Ренни Сонг, Ритм Гарг, Р.Дж. Марсан, Роберт Сьонг, Робин Браун, Роман Циупа, Руи Шу, Руслан Нигматуллин, Саачи Джайн, Саагар Патель, Сэм Альтман, Сэм Тойзер, Сэм Тойер, Самир Ахмед, Самуэль Мисерендино, Самуэль Волрих, Сандхини Агарвал, Сантьяго Эрнандес, Сара Донг, Саванна Хон, Скотт Этерсмит, Скотт Майер МакКинни, Шон Фицджеральд, Север Банесиу, Шамез Хемани, Шэнцзя Чжао, Шэнли Ху, Шибани Сантуркар, Шреяс Кришнасвами, Шучао Би, Шунью Яо, Шуюань Чжан, Симон Посада Фишман, Спенсер Папай, Спаг Голден, Сринивас Нараянан, Стэнли Сие, Стивен Логсдон, Сундип Тирумаларедди, Тал Страмер, Тао Ванг, Тао Син, Тейлор Гордон, Теджал Патвардхан, Тибо Соттио, Тина Срискандараджа, Тони Каспарро, Тони Чжао, Тревор Крич, Узайр Навид Ифтихар, Валери Ци, Винит Косараджу, Вишал Куо, Витчир Понг, Вивек Верма, Влад Петров, Вэнда Чжоу, Вэньлэй Се, Вэньтин Чжан, Уилл ДеПю, Уилл Эллсворт, Уильям Шеу, Уайатт Томпсон, Ямин Лин, Ян Дюбуа, Яодун Ю, Яра Хакбаз, Яш Патил, Ифан Ву, Илун Цин, Инин Чэнь, Ируй Чжан, Ё Шавит, Ён Ча, Юньюнь Ван, Юши Ван, Зак Султан, Цзэхао Доу, Цзэвэй Чу, Чжэн Шао, Чжиган Ван, Чжишуай Чжан, Цзыхао Чжан