OpenAI

December 11, 2025


Advancing science and math with GPT‑5.2

GPT‑5.2 is our strongest model yet for math and science work.


One of our hopes for strong AI is that it will accelerate scientific research for the benefit of everyone, helping researchers explore more ideas, test them faster, and turn discoveries into impact.

Over the past year, we have worked closely with scientists across math, physics, biology, and computer science to understand where AI can help and where it still falls short. Last month, we published a paper collecting early case studies across math, physics, biology, computer science, astronomy, and materials science in which GPT‑5 helped researchers, showing how GPT‑5 has already begun contributing to real scientific work. With GPT‑5.2, we are starting to see those gains become more consistent and more reliable.

Stronger performance where precision matters

GPT‑5.2 Pro and GPT‑5.2 Thinking are our strongest models to date for scientific and mathematical work.

Strong mathematical reasoning is a foundation for reliability in scientific and technical work. It lets models follow multi-step logic, keep quantities consistent, and avoid the subtle errors that compound in real analyses, from simulations and statistics to forecasting and modeling. Improvements on benchmarks such as FrontierMath do not reflect a narrow skill alone. They reflect stronger general reasoning and abstraction, capabilities that carry over directly into scientific workflows such as writing code, analyzing data, and designing experiments.

These capabilities are also closely tied to progress toward general intelligence. A system that can reliably work with abstraction, maintain consistency across long chains of thought, and generalize across domains is showing traits that are foundational to AGI: not task-specific tricks, but broad, transferable reasoning skills that matter across science, engineering, and real-world decision-making.

We believe GPT‑5.2 Pro and GPT‑5.2 Thinking are the world's best models for assisting and accelerating scientists. On GPQA Diamond, a graduate-level, Google-proof question-and-answer benchmark, GPT‑5.2 Pro achieves 93.2%, followed closely by GPT‑5.2 Thinking at 92.4%.

In GPQA Diamond, models answer multiple-choice questions about physics, chemistry, and biology. No tools were enabled, and reasoning effort was set to maximum.

On FrontierMath (Tier 1–3), an evaluation of expert-level mathematics, GPT‑5.2 Thinking set a new state of the art, solving 40.3% of problems.

In FrontierMath, models solve expert-level mathematics problems. A Python tool was enabled, and reasoning effort was set to maximum.

Case study

GPT‑5.2 is not only strong at graduate-level science problems. We now regularly see our frontier models contributing solutions to previously unsolved—and increasingly subtle—questions in mathematics and the sciences.

In this case study, we describe how GPT‑5.2 Pro helped resolve an open research problem in statistical learning theory, documented in a new paper, On Learning-Curve Monotonicity for Maximum Likelihood Estimators.

The question (“If you collect more data, do your results reliably get better?”) shows up any time you fit a model from data. You can draw a learning curve that tracks average error as you add more examples. In the best case, the curve is monotone. More data means less error, every step of the way. That is the behavior people hope for, and often assume.

But over the last few years, researchers have learned that this intuition can fail. A line of work kicked off by an open problem posed at the Conference on Learning Theory (COLT) in 2019 by Viering, Mey, and Loog showed that the answer is often no. Even very simple, well-behaved toy setups can have non-monotonic learning curves, where adding data increases expected error. That surprise triggered a wave of follow-up papers. They expanded the list of settings where these reversals happen and proposed increasingly elaborate methods designed to restore monotone behavior.

Still, one of the most basic cases remained unresolved. What happens in the cleanest textbook situation, where the statistical model is actually correct and the data follow the familiar bell curve pattern, with a known mean but unknown standard deviation? Researchers already knew that small changes to this setup could break monotonic behavior. But the answer remained unknown in this core case.
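To make the setting concrete, the learning curve for this textbook case can be estimated by simulation. The sketch below is illustrative only and is not taken from the paper. It assumes that "error" means the KL divergence from the fitted Gaussian to the true one, which may differ from the risk the paper analyzes, and the names `reverse_kl` and `learning_curve` are hypothetical. A simulation like this can only suggest a trend. It does not prove monotonicity.

```python
import math
import random


def reverse_kl(var_hat: float, var_true: float) -> float:
    """KL(N(mu, var_hat) || N(mu, var_true)) for two Gaussians sharing a mean.

    Assumption: this is one possible error measure, chosen for illustration.
    """
    ratio = var_hat / var_true
    return 0.5 * (ratio - 1.0 - math.log(ratio))


def learning_curve(sample_sizes, mu=0.0, sigma=1.0, trials=20_000, seed=0):
    """Monte Carlo estimate of the expected error at each sample size."""
    rng = random.Random(seed)
    curve = {}
    for n in sample_sizes:
        total = 0.0
        for _ in range(trials):
            # Maximum likelihood estimate of the variance when the mean is known:
            # the average squared deviation from the known mean.
            var_hat = sum((rng.gauss(mu, sigma) - mu) ** 2 for _ in range(n)) / n
            total += reverse_kl(var_hat, sigma ** 2)
        curve[n] = total / trials
    return curve


if __name__ == "__main__":
    for n, err in learning_curve([1, 2, 4, 8, 16]).items():
        print(f"n={n:2d}  estimated expected error={err:.4f}")
```

Plotting the returned values against the sample size gives an empirical learning curve. If the curve is monotone, the estimates should fall as n grows, up to Monte Carlo noise.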

Our new paper demonstrates that in this clean setting, intuition prevails: learning is predictably improved by more data, rather than behaving in surprising or unstable ways. What makes this paper unusual is how the proof was obtained. The authors did not work out a strategy and then ask the model to fill in steps. They did not provide intermediate arguments or a proof outline. Instead, they asked GPT‑5.2 Pro to solve the open problem directly, and then carefully verified the proof, including review and validation by external subject-matter experts.

The authors then asked simple follow-up questions to see how far the idea could go. GPT‑5.2 Pro extended the result beyond the original problem to higher dimensional settings and other common statistical models. Throughout, the human role stayed focused on verification and clear writing, rather than supplying mathematical scaffolding.

Looking ahead

This result points to a useful direction for how AI systems can support scientific research, particularly in fields with axiomatic theoretical foundations such as mathematics and theoretical computer science. In settings like these, frontier models can help explore proofs, test hypotheses, and identify connections that would otherwise take substantial human effort to uncover.

At the same time, these systems are not independent researchers. Expert judgment, verification, and domain understanding remain essential. Even highly capable models can make mistakes or rely on unstated assumptions. But they can also produce detailed, structured arguments that merit careful human study and refinement. Making reliable progress with AI therefore depends on workflows that keep verification, transparency, and collaboration firmly in the loop.

Viewed as a case study, this result illustrates an emerging mode of research. Models like GPT‑5.2 can serve as tools that support mathematical reasoning and accelerate early-stage exploration, while responsibility for correctness, interpretation, and context remains with human researchers. Used carefully, such systems can help streamline significant parts of theoretical work without displacing the central role of human judgment in scientific inquiry.