OpenAI

Advancing science and math with GPT‑5.2

GPT‑5.2 is our strongest model yet for math and science work.

One of our hopes for strong AI is that it will accelerate scientific research for the benefit of everyone, helping researchers explore more ideas, test them faster, and turn discoveries into impact.

Over the past year, we have worked closely with scientists in math, physics, biology, and computer science to understand where AI can help and where it still falls short. Last month, we published a paper compiling early case studies in math, physics, biology, computer science, astronomy, and materials science in which GPT‑5 assisted researchers, showing how GPT‑5 has already begun contributing to real scientific work. With GPT‑5.2, we are starting to see those results become more consistent and more reliable.

Stronger performance where precision matters

GPT‑5.2 Pro and GPT‑5.2 Thinking are our strongest models to date for scientific and mathematical work.

Strong mathematical reasoning is a foundation for reliability in scientific and technical work. It lets a model follow multi-step logic, keep quantities consistent, and avoid subtle errors that can compound in real analyses, from simulations and statistics to forecasting and modeling. Improvements on benchmarks like FrontierMath reflect not a narrow skill but stronger general reasoning and abstraction, capabilities that carry over directly to scientific workflows such as coding, data analysis, and experimental design.

These capabilities are also closely tied to progress toward general intelligence. A system that can reason reliably through abstraction, stay consistent across long chains of thought, and generalize across domains is showing traits that are foundational to AGI: not task-specific tricks, but broad, transferable reasoning skills that matter in science, engineering, and real-world decision-making.

We believe GPT‑5.2 Pro and GPT‑5.2 Thinking are the world's best models for assisting and accelerating the work of scientists. On GPQA Diamond, a graduate-level, Google-proof question-answering benchmark, GPT‑5.2 Pro achieves 93.2%, followed closely by GPT‑5.2 Thinking at 92.4%.

In GPQA Diamond, models answer multiple-choice questions about physics, chemistry, and biology. No tools were enabled and reasoning effort was set to maximum.

On FrontierMath (Tier 1–3), an evaluation of expert-level mathematics, GPT‑5.2 Thinking set a new state of the art by solving 40.3% of the problems.

In FrontierMath, models solve expert-level mathematics problems. A Python tool was enabled and reasoning effort was set to maximum.

Case study

GPT‑5.2 is not only strong at graduate-level science problems. We now regularly see our frontier models contributing solutions to previously unsolved—and increasingly subtle—questions in mathematics and the sciences.

In this case study, we describe how GPT‑5.2 Pro helped resolve an open research problem in statistical learning theory, documented in a new paper, On Learning-Curve Monotonicity for Maximum Likelihood Estimators.

The question (“If you collect more data, do your results reliably get better?”) shows up any time you fit a model from data. You can draw a learning curve that tracks average error as you add more examples. In the best case, the curve is monotone. More data means less error, every step of the way. That is the behavior people hope for, and often assume.
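One way to state this property in symbols is sketched below. The notation is an assumption introduced here for illustration and is not taken from the paper: A stands for the learning procedure, S_n for a sample of n independent examples, and L for the error of the fitted model.

```latex
% Assumed notation: A = learning procedure, S_n = i.i.d. sample of size n,
% L = error of the fitted model, R(n) = learning curve.
\[
  R(n) = \mathbb{E}_{S_n}\bigl[\, L\bigl(A(S_n)\bigr) \bigr],
  \qquad
  \text{monotone} \iff R(n+1) \le R(n) \ \text{for all } n \ge 1 .
\]
```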

But over the last few years, researchers have learned that this intuition can fail. A line of work kicked off by an open problem posed at the Conference on Learning Theory (COLT) in 2019 by Viering, Mey, and Loog showed that the answer is often no. Even very simple, well-behaved toy setups can have non-monotonic learning curves, where adding data increases expected error. That surprise triggered a wave of follow-up papers. They expanded the list of settings where these reversals happen and proposed increasingly elaborate methods designed to restore monotone behavior.

Still, one of the most basic cases remained unresolved. What happens in the cleanest textbook situation, where the statistical model is actually correct and the data follow the familiar bell curve pattern, with a known mean but unknown standard deviation? Researchers already knew that small changes to this setup could break monotonic behavior. But the answer remained unknown in this core case.
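To make this setting concrete, the sketch below computes a learning curve in closed form for the maximum likelihood estimate of the variance of a Gaussian with known mean. The error measure is an assumption: the text above does not name one, so the sketch uses the expected Kullback-Leibler divergence from the true distribution to the fitted one, a common choice for density estimation. All function names are hypothetical. With n samples, n·σ̂²/σ² follows a chi-squared distribution with n degrees of freedom, which gives an expected divergence of ½[ψ(n/2) + ln(2/n) + n/(n−2) − 1] for n ≥ 3, where ψ is the digamma function.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def digamma_half(n: int) -> float:
    """Return psi(n / 2) for a positive integer n.

    Uses psi(1) = -gamma, psi(1/2) = -gamma - 2 ln 2, and the
    recurrence psi(x + 1) = psi(x) + 1 / x.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n % 2 == 0:
        x, value = 1.0, -EULER_GAMMA
    else:
        x, value = 0.5, -EULER_GAMMA - 2.0 * math.log(2.0)
    while x < n / 2:
        value += 1.0 / x
        x += 1.0
    return value


def expected_kl(n: int) -> float:
    """Expected KL(true || fitted) for the variance MLE with known mean.

    Assumed error measure (not taken from the article). Finite only for
    n >= 3, because E[1 / chi2_n] = 1 / (n - 2) requires n > 2.
    """
    if n < 3:
        return math.inf
    return 0.5 * (digamma_half(n) + math.log(2.0 / n) + n / (n - 2) - 1.0)


def is_monotone(curve: list[float]) -> bool:
    """True if each value is no larger than the one before it."""
    return all(later <= earlier for earlier, later in zip(curve, curve[1:]))


sample_sizes = range(3, 51)
curve = [expected_kl(n) for n in sample_sizes]
print(is_monotone(curve))
```

For these sample sizes the check prints True. This is only a numerical illustration under the assumed error measure; it does not replace a proof that the curve decreases for every n.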

Our new paper demonstrates that in this clean setting, intuition prevails: learning is predictably improved by more data, rather than behaving in surprising or unstable ways. What makes this paper unusual is how the proof was obtained. The authors did not work out a strategy and then ask the model to fill in steps. They did not provide intermediate arguments or a proof outline. Instead, they asked GPT‑5.2 Pro to solve the open problem directly, and then carefully verified the proof, including review and validation by external subject-matter experts.

The authors then asked simple follow-up questions to see how far the idea could go. GPT‑5.2 Pro extended the result beyond the original problem to higher dimensional settings and other common statistical models. Throughout, the human role stayed focused on verification and clear writing, rather than supplying mathematical scaffolding.

Looking ahead

This result points to a useful direction for how AI systems can support scientific research, particularly in domains with axiomatic theoretical foundations such as mathematics and theoretical computer science. In settings like these, frontier models can help explore proofs, test hypotheses, and identify connections that might otherwise take substantial human effort to find.

At the same time, these systems are not independent researchers. Expert judgment, verification, and domain understanding remain essential. Even highly capable models can make mistakes or rely on unstated assumptions. But they can also produce detailed, structured arguments that merit careful human scrutiny and refinement. Reliable progress with AI therefore depends on workflows that keep validation, transparency, and collaboration in the loop.

Viewed as a case study, this result illustrates an emerging mode of research practice. Models like GPT‑5.2 can serve as tools for supporting mathematical reasoning and accelerating early-stage exploration, while responsibility for correctness, interpretation, and context remains with human researchers. Used carefully, such systems may help streamline significant aspects of theoretical work without displacing the central role of human judgment in scientific inquiry.