Soo bandhigida GPT‑5 ee horumariyeyaasha
Nooca ugu fiican ee koodhka iyo hawlaha wakiilnimada.
Maanta, waxaan ku sii deynaynaa GPT‑5 madalkeenna API—nooceeda ugu fiican illaa hadda ee koodhka iyo hawlaha wakiilnimada.
GPT‑5 waa heerkii ugu sarreeyay (SOTA) ee halbeegyada muhiimka ah ee koodhka, isagoo helay 74.9% SWE-bench Verified iyo 88% Aider polyglot. Waxaan u tababarnay GPT‑5 inuu noqdo la-shaqeeye dhab ah oo koodhin ah. Wuxuu aad ugu fiican yahay soo saarista koodh tayo sare leh iyo qabashada hawlo sida sixidda bug-yada, tafatirka koodhka, iyo ka jawaabista su’aalaha ku saabsan kaydadka koodhka ee adag. Noocu waa la hagi karaa waana wada-shaqeyn—wuxuu si sax ah u raaci karaa tilmaamo aad u faahfaahsan, wuxuuna bixin karaa sharraxaado horudhac ah oo ku saabsan ficilladiisa ka hor iyo inta u dhexeysa wicitaannada qalabka. Noocu sidoo kale aad buu ugu fiican yahay frontend coding, isagoo 70% ka adkaada OpenAI o3 ee horumarinta webka frontend-ka marka la eego tijaabooyinkeenna gudaha.
Waxaan ku tababarnay GPT‑5 hawlo koodhin oo dunida dhabta ah ah annagoo kaashanayna tijaabiyeyaal hore oo ka socda startup-yo iyo shirkado waaweyn. Cursor wuxuu sheegay GPT‑5 inuu yahay “nooca ugu caqliga badan ee [ay] isticmaaleen” isla markaana “si layaab leh u caqli badan, sahlan in la hago, xitaa leh shakhsiyad aan [ay] ku arag noocyo kale.” Windsurf wuxuu wadaagay in GPT‑5 uu yahay SOTA marka loo eego qiimeyntooda isla markaana “uu leeyahay kala bar heerka qaladka xusida function-ka marka loo eego noocyada kale ee ugu casriyeysan.” Vercel wuxuu yiri “waa nooca AI frontend-ka ugu fiican, isagoo gaaraya waxqabadka ugu sarreeya labadaba dareenka bilicsanaanta iyo tayada koodhka, taasoo gelinaysa fasal u gaar ah.”
GPT‑5 sidoo kale aad buu ugu fiican yahay hawlaha wakiilnimo ee soconaya waqti dheer—wuxuu gaaray natiijooyin SOTA ah τ2-bench telecom (96.7%), oo ah halbeeg xusida function-ka la sii daayay 2 bilood ka hor. Garaadka qalab ee GPT‑5 ee la hagaajiyay wuxuu u oggolaanayaa inuu si kalsooni leh isugu xiro daraasiin wicitaanno qalab ah—isku xigxig iyo barbar socodba—iyadoo aanuu jidka lumin, taasoo ka dhigaysa mid aad uga fiican fulinta hawlo adag oo dunida dhabta ah dhammaad-ilaa-dhammaad. Waxa kale oo uu si sax ah u raacaa tilmaamaha qalabka, uga fiican yahay maaraynta khaladaadka qalabka, wuxuuna aad ugu fiican yahay soo celinta nuxurka macnaha dheer. Manus wuxuu yiri GPT‑5 “wuxuu gaaray waxqabadkii ugu fiicnaa ee [ay] abid ka arkaan hal nooc marka loo eego halbeegyadooda gudaha.” Notion wuxuu yiri “jawaabaha degdegga ah ee [nooca], gaar ahaan habka caqliyeyn hoose, waxay GPT‑5 ka dhigaan nooc ku habboon markaad u baahan tahay hawlo adag in hal mar lagu xalliyo.” Inditex wuxuu wadaagay “waxa dhab ahaan [GPT‑5] uga soocaya waa qotada caqliyeyntiisa: jawaabo xeel dheer oo lakabyo badan leh oo ka tarjumaya faham dhab ah oo mawduuca ah.”
Waxaan API-geenna ku soo bandhigaynaa astaamo cusub si aan horumariyeyaasha u siino xakameyn dheeraad ah oo ku saabsan jawaabaha nooca. GPT‑5 wuxuu taageeraa doorsoome API cusub oo verbosity ah (qiimayaal: low, medium, high) si ay uga caawiso xakamaynta in jawaabuhu noqdaan gaagaaban oo toos ah ama dheer oo dhammaystiran. Doorsoomaha reasoning_effort ee GPT‑5 hadda wuxuu qaadan karaa qiime minimal ah si jawaabaha dib loogu helo si ka dhakhso badan, iyadoo aan marka hore la samayn caqliyeyn ballaaran. Waxaan sidoo kale ku darnay nooc qalab cusub—custom tools—si GPT‑5 ugu waco qalab qoraal caadi ah halkii JSON. Custom tools-ku waxay taageeraan xaddididda iyadoo la adeegsanayo context-free grammars oo uu bixiyo horumariyuhu.
Waxaan GPT‑5 ku sii deynaynaa saddex cabbir gudaha API-ga—gpt-5, gpt-5-mini, iyo gpt-5-nano—si aan horumariyeyaasha u siino dabacsanaan dheeraad ah oo u dhexeeya waxqabadka, qiimaha, iyo dib-u-dhaca. Halka GPT‑5 ee ChatGPT uu yahay nidaam ka kooban noocyo caqliyeyn, aan-caqliyeyn, iyo router ah, GPT‑5 ee madasha API-ga waa nooca caqliyeynta ee awoodda ugu badan siiya ChatGPT. Si gaar ah, GPT‑5 oo leh caqliyeyn minimal ah waa nooc ka duwan nooca aan-caqliyeynta ahayn ee ChatGPT, waxaana si ka fiican loogu habeeyay horumariyeyaasha. Nooca aan-caqliyeynta ahayn ee lagu isticmaalo ChatGPT waxaa loo heli karaa sida gpt-5-chat-latest.
Si aad u akhrido wax ku saabsan GPT‑5 gudaha ChatGPT, oo aad wax badan uga barato horumarinta kale ee ChatGPT, eeg bloggeenna cilmi-baarista. Faahfaahin dheeraad ah oo ku saabsan sida shirkaduhu ugu faraxsan yihiin isticmaalka GPT‑5, eeg bloggeenna shirkadaha.
GPT‑5 waa nooca koodhka ugu xooggan ee aan abid sii deynay. Wuxuu ka sarreeyaa o3 dhammaan halbeegyada koodhka iyo adeegsiyada dunida dhabta ah, waxaana si gaar ah loogu hagaajiyay inuu ka iftiimo alaabooyinka koodhka wakiilnimada leh sida Cursor, Windsurf, GitHub Copilot, iyo Codex CLI. GPT‑5 wuxuu aad uga yaabiyay tijaabiyeyaasheenna alpha, isagoo jabiyay rikoorro badan oo qiimeynadooda gudaha ee gaarka ah ah.
Jawaab-celin hore oo ku saabsan GPT‑5 ee hawlaha koodhka ee dunida dhabta ah
“GPT-5 waa nooca koodh-qorista ugu caqliga badan ee aan isticmaalnay. Kooxdayadu waxay ogaatay in GPT-5 uu yahay mid si layaab leh u caqli badan, si fudud loo hago, xitaa lehna shakhsiyad aynaan ku arag nooc kale. Kaliya ma qabto bug-yo khiyaano badan oo si qoto dheer u qarsoon, balse wuxuu sidoo kale wadi karaa wakiillo asalka dambe ah oo dheer oo wareegyo badan leh si ay hawlo adag u gaarsiiyaan dhammaad—kuwaas oo ahaa dhibaatooyin noocyadii kale ku xannibmi jireen. Wuxuu noqday xulashadayada maalinlaha ah ee wax kasta laga bilaabo qeexidda iyo qorsheynta PR ilaa dhammeystirka build-yo dhammaad-ilaa-dhammaad ah.”
SWE-bench Verified, oo ah qiimeyn ku saleysan hawlaha injineernimada software-ka ee dunida dhabta ah, GPT‑5 wuxuu helay 74.9%, halka o3 uu ahaa 69.1%. Si gaar ah, GPT‑5 wuxuu ku gaaraa dhibcahan sare wax-ku-oolnimo iyo xawaare ka badan: marka loo eego o3 oo leh dadaal caqliyeyn sare, GPT‑5 wuxuu adeegsadaa 22% token-yo soo saarid ah oo ka yar iyo 45% wicitaanno qalab ah oo ka yar.
Gudaha SWE-bench Verified, nooca waxaa la siiyaa keydka koodhka iyo sharaxaadda arrinta, waana inuu abuuraa patch lagu xalliyo arrinta. Calaamadaha qoraalku waxay muujinayaan dadaalka caqliyeynta. Dhibcahayagu waxay ka reebaan 23 ka mid ah 500 dhibaato oo xalkoodu uusan si kalsooni leh uga gudbin kaabayaashayada. GPT‑5 waxaa la siiyay weydiin gaaban oo adkaynaysay in si qoto dheer loo xaqiijiyo xalalka; isla weydiintaas wax faa’iido ah uma yeelin o3.
Aider polyglot, oo ah qiimeyn tafatirka koodhka, GPT‑5 wuxuu dhigay rikoor cusub oo ah 88%, taasoo ah hoos u dhac saddex meelood meel ah oo ku yimid heerka qaladka marka loo eego o3.
Gudaha Aider polygot(ku furmaa daaqad cusub) (diff), nooc ayaa la siiyaa layli koodhin ah oo ka yimid Exercism waana inuu xalkiisa u qoro sida code diff. Noocyada caqliyeynta waxa lagu socodsiiyay dadaal caqliyeyn oo sarreeya.
Waxaan sidoo kale ogaanay in GPT‑5 uu aad ugu fiican yahay inuu si qoto dheer u galo kaydadka koodhka si uu uga jawaabo su’aalaha ku saabsan sida qaybo kala duwan u shaqeeyaan ama isula falgalaan. Kayd koodh oo u adag sida keydka waxbarashadda xoojinta ah ee OpenAI, waxaan aragnaa in GPT‑5 uu naga caawin karo inaan ka caqliyeeyno una jawaabno su’aalaha ku saabsan koodhkeenna, taasoo dedejineysa shaqadeenna maalinlaha ah.
Marka uu soo saarayo koodh frontend oo loogu talagalay barnaamijyada webka, GPT‑5 waa mid si badan u daneeya bilicsanaanta, hammi badan, oo sax ah. Isbarbardhigyo dhinac-dhinac ah oo lala sameeyay o3, GPT‑5 waxaa doorbiday tijaabiyeyaasheenna 70% waqtiga.
Halkan waxaa ku yaal tusaalooyin madadaalo leh oo si gaar ah loo soo xulay oo muujinaya waxa GPT‑5 ku qaban karo hal weydiin oo keliya:
Weydiin: Fadlan samee bog degitaan qurux badan oo dhab u eg oo loogu talagalay adeeg siinaya qofka aadka u jecel qaxwada rukunsi $200/bishii ah, kaas oo bixiya kiro qalab iyo tababar ku saabsan dubista qaxwada iyo sameynta espresso-ga ugu fiican. Dadka bartilmaameedka ah waa qof da’ dhexaad ah oo deggan bay area, laga yaabo inuu ka shaqeeyo tignoolajiyada, waxbartay, haysta dakhli dheeraad ah, kuna xiiseeya farshaxanka iyo cilmiga qaxwada. U hagaaji si uu u kordhiyo isdiiwaangelinta 6 bilood ah.
Ka eeg tusaalooyin kale oo GPT‑5 ah gallery-gayaga halkan(ku furmaa daaqad cusub).
GPT‑5 waa la-shaqeeye ka wanaagsan, gaar ahaan alaabooyinka koodhka wakiilnimada leh sida Cursor, Windsurf, GitHub Copilot, iyo Codex CLI. Inta uu shaqeynayo, GPT‑5 wuxuu soo saari karaa qorshayaal, cusboonaysiin, iyo soo-koobidyo u dhexeeya wicitaannada qalabka. Marka loo eego noocyadeennii hore, GPT‑5 waa mid firfircoon oo ka badan marka ay timaaddo dhammeystirka hawlo hammi leh, isaga oo aan joogsan si uu u sugo oggolaanshahaaga ama uga niyad jabin kakanaanta sare.
Halkan waxaa ku yaal tusaale muujinaya sida GPT‑5 u ekaan karo marka uu wajahayo hawl adag (kiiskan, sameynta website makhaayad):
Ka dib marka isticmaaluhu codsado website makhaayaddiisa ah, GPT‑5 wuxuu wadaagaa qorshe degdeg ah, wuxuu diyaariyaa qaab-dhismeedka app-ka, rakibaa ku-tiirsanaanta, abuuraa nuxurka bogga, socodsiiya build si uu u hubiyo khaladaadka ururinta, soo koobaa shaqadiisa, wuxuuna soo jeediyaa tallaabooyinka xiga ee suurtagalka ah. Fiidyowgan waxaa la dedejiyay qiyaastii ~3x si lagaa badbaadiyo sugitaanka; muddada buuxda ee lagu sameeyay website-ku waxay ahayd ku dhowaad saddex daqiiqo.
Ka sokow agentic coding, GPT‑5 guud ahaan aad buu ugu fiican yahay hawlaha wakiilnimada. GPT‑5 wuxuu dhigay rikoorro cusub halbeegyada raacitaanka tilmaamaha (69.6% Scale MultiChallenge, sida uu qiimeeyay o3‑mini) iyo xusida function-ka (96.7% τ2-bench telecom). Garaadka qalabka ee la hagaajiyay wuxuu u oggolaanayaa GPT‑5 inuu si kalsooni leh isugu xiro ficillo si uu u qabto hawlo dunida dhabta ah.
Jawaab-celin hore oo ku saabsan GPT‑5 ee hawlaha wakiilnimada
“GPT-5 waa tallaabo weyn oo hore loo qaaday. Wuxuu gaaray waxqabadkii ugu fiicnaa ee aan abid ka aragno hal nooc marka loo eego halbeegyadayada gudaha. GPT-5 wuxuu ku fiicnaaday hawlo kala duwan oo wakiilnimo leh—xitaa ka hor inta aanan wax ka beddelin hal sadar oo koodh ah ama aan habeyn weydiin. Hordhacyada cusub iyo xakameyn sax ah oo dheeraad ah oo ku saabsan isticmaalka qalabka ayaa suurtageliyay bood weyn oo ku yimid xasilloonida iyo sida loo hago wakiilladayada.”
GPT‑5 wuxuu u raacaa tilmaamaha si ka kalsooni badan dhammaan kuwii ka horreeyay, isagoo dhibco sare ka keenay COLLIE, Scale MultiChallenge, iyo qiimeynteenna gudaha ee raacitaanka tilmaamaha.
Gudaha COLLIE(ku furmaa daaqad cusub), noocyadu waa inay qoraan qoraal buuxiya xaddidaamo kala duwan. Gudaha Scale MultiChallenge(ku furmaa daaqad cusub), noocyada waxaa lagu tijaabiyaa wada-sheekeysi wareegyo badan leh si ay si sax ah ugu adeegsadaan afar nooc oo macluumaad ah farriimihii hore. Dhibcahayagu waxay ka yimaadeen isticmaalka o3‑mini sida qiimeeye, kaas oo ka saxsanaa GPT‑4o. Qiimeyntayada gudaha ee OpenAI API ee raacitaanka tilmaamaha, noocyadu waa inay raacaan tilmaamo adag oo laga soo qaatay jawaab-celinta horumariyeyaasha dhabta ah. Noocyada caqliyeynta waxa lagu socodsiiyay dadaal caqliyeyn oo sarreeya.
Waxaan si adag uga shaqaynay hagaajinta xusida function-ka qaababka muhiimka u ah horumariyeyaasha. GPT‑5 wuxuu ka fiican yahay raacitaanka tilmaamaha qalabka, ka fiican yahay la tacaalidda khaladaadka qalabka, sidoo kale ka fiican yahay sameynta si firfircoon wicitaanno qalab badan oo isku xigxiga ama barbar socda. Marka la faro, GPT‑5 wuxuu sidoo kale soo saari karaa farriimo horudhac ah ka hor iyo inta u dhexeysa wicitaannada qalabka si uu dadka isticmaala ugu cusbooneysiiyo horumarka intii ay socdeen hawlaha wakiilnimada ee dheer.
Laba bilood ka hor, τ2-bench telecom waxaa daabacday Sierra.ai isagoo ah halbeeg adag oo isticmaalka qalabka ah kaas oo muujiyay sida waxqabadka qaabka luuqaddu si weyn hoos ugu dhaco marka uu la falgalayo xaalad deegaan oo dadka isticmaalaa wax ka beddeli karaan. Gudaha daabacaaddooda(ku furmaa daaqad cusub), wax nooc ah kama uusan helin ka badan 49%. GPT‑5 wuxuu helay 97%.
Gudaha τ2-bench(ku furmaa daaqad cusub), noocu waa inuu adeegsadaa qalab si uu u dhammaystiro hawl adeeg macaamiil, halkaas oo uu jiri karo isticmaale xiriiri kara isla markaana ficillo ka qaadi kara xaaladda dunida. Noocyada caqliyeynta waxa lagu socodsiiyay dadaal caqliyeyn oo sarreeya.
GPT‑5 sidoo kale wuxuu muujiyaa horumar xooggan oo ku saabsan waxqabadka macnaha dheer. OpenAI-MRCR, oo ah cabbirka soo-celinta macluumaadka macnaha dheer, GPT‑5 wuxuu ka sarreeyaa o3 iyo GPT‑4.1, iyadoo farqigu si weyn u sii kordhayo marka dhererka gelintu bato.
Gudaha OpenAI-MRCR(ku furmaa daaqad cusub) (multi-round co-reference resolution), codsiyo isticmaale oo isku mid ah oo “needle” ah ayaa lagu dhex daraa “haystacks” dheer oo codsiyo iyo jawaabo la mid ah, waxaana nooca laga codsadaa inuu soo saaro jawaabta needle-ka i-th. Mean match ratio wuxuu cabbiraa celceliska iswaafajinta string-ga ee u dhexeeya jawaabta nooca iyo jawaabta saxda ah. Qodobbada ku yaal 256k max input tokens waxay metelaan celcelisyo ka badan 128k–256k input tokens, iyo wixii la mid ah. Halkan, 256k waxay ka dhigan tahay 256 * 1,024 = 262,114 token. Noocyada caqliyeynta waxa lagu socodsiiyay dadaal caqliyeyn oo sarreeya.
Waxaan sidoo kale si furan u daabacaynaa BrowseComp Long Context(ku furmaa daaqad cusub), oo ah halbeeg cusub oo lagu qiimeeyo su’aal-jawaabta macnaha dheer. Halbeeggan, nooca waxaa la siiyaa su’aal isticmaalaha ah, liis dheer oo natiijooyin raadis la xiriira ah, waana inuu su’aasha kaga jawaabaa iyadoo lagu saleynayo natiijooyinka raadinta. Waxaan u naqshadeynay BrowseComp Long Context inuu noqdo mid dhab ah, adag, oo leh jawaabo sax ah oo si kalsooni leh loo hubo. Gelinno ah 128K–256K token, GPT‑5 wuxuu bixiyaa jawaabta saxda ah 89% waqtiga.
Gudaha API-ga, dhammaan noocyada GPT‑5 waxay aqbali karaan ugu badnaan 272,000 input token waxayna soo saari karaan ugu badnaan 128,000 token oo caqliyeyn & output ah, wadar ahaan dherer macne oo ah 400,000 token.
GPT‑5 waa ka kalsooni badan yahay noocyadeennii hore. Weydiimo laga soo qaatay halbeegyada LongFact iyo FactScore, GPT‑5 wuxuu sameeyaa qiyaastii ~80% khaladaad xaqiiqo oo ka yar o3. Tani waxay ka dhigaysaa mid ku habboon adeegsiyada wakiilnimada halka saxnimadu muhiim tahay—gaar ahaan koodhka, xogta, iyo go’aan-qaadashada.
Dhibcaha sare way ka sii xun yihiin. LongFact(ku furmaa daaqad cusub) iyo FActScore(ku furmaa daaqad cusub) waxay ka kooban yihiin su’aalo furan oo xaqiiqo-doon ah. Waxaan isticmaalnaa qiimeeye ku saleysan LLM oo leh browsing si loo xaqiijiyo jawaabaha ku saleysan weydiimaha halbeegyadan iyo in la cabbiro saamiga sheegashooyinka xaqiiqo ahaan khaldan. Faahfaahinta hirgelinta iyo qiimeynta waxaa laga heli karaa kaarka siistamka. Noocyada caqliyeynta waxaa loo adeegsaday dadaal caqliyeyn sare. Search lama hawlgelin.
Guud ahaan, GPT‑5 waxaa loo tababaray inuu si fiican uga warqabo xaddidaadihiisa iyo inuu si ka fiican ula tacaalo waxyaabaha lama filaanka ah. Waxaan sidoo kale GPT‑5 u tababarnay inuu aad uga saxsanaado su’aalaha caafimaadka (wax dheeraad ah ka akhri bloggeenna cilmi-baarista). Sida nooc kasta oo luuqadeed, waxaan kugula talinaynaa inaad xaqiijiso shaqada GPT‑5 marka khatartu sarreyso.
Horumariyayaashu waxay xakamayn karaan waqtiga fikirka GPT‑5 iyagoo adeegsanaya doorsoomaha reasoning_effort ee API-ga. Marka lagu daro qiimayaashii hore—low, medium (default), iyo high—GPT‑5 wuxuu sidoo kale taageeraa minimal, kaas oo yareeya caqliyeynta GPT‑5 si jawaab degdeg ah loo soo celiyo.
Qiimaha sare ee reasoning_effort wuxuu kordhiyaa tayada halka qiimaha hoose uu kordhiyo xawaaraha. Dhammaan hawluhu si isku mid ah ugama faa’iidaystaan caqliyeyn dheeri ah, sidaas darteed waxaan kugula talinaynaa inaad tijaabiso si aad u aragto waxa ugu fiican adeegsiyada aad danaynayso.
Tusaale ahaan, caqliyeyn ka sarreysa low wax yar bay ku kordhisaa soo-celinta macnaha dheer ee fudud, balse waxay ku darsataa dhibco boqolley ah oo badan CharXiv Reasoning(ku furmaa daaqad cusub), oo ah halbeeg caqliyeyn muuqaal ah.
Dadaalka caqliyeynta ee GPT‑5 wuxuu keenaa faa’iidooyin kala duwan hawlo kala duwan. CharXiv Reasoning, GPT‑5 waxaa loo oggolaaday inuu isticmaalo qalab python ah.
Si looga caawiyo hagidda dhererka caadiga ah ee jawaabaha GPT‑5, waxaan soo bandhignay doorsoome API cusub oo verbosity ah, kaas oo qaata qiimayaasha low, medium (default), iyo high. Haddii tilmaamo cad ay ka hor yimaadaan doorsoomayaasha verbosity, tilmaamaha cad ayaa mudnaan leh. Tusaale ahaan, haddii aad GPT‑5 ka codsato “qor maqaal 5 faqrad ah”, jawaabta nooca waa inay mar walba noqotaa 5 faqrad iyadoon loo eegin heerka verbosity-ga (si kastaba ha ahaatee, faqradaha laftoodu way dheeraan karaan ama way gaaban karaan).
Faahfaahin=hoose
Faahfaahin=heer_dhexe
Faahfaahin=sare
Haddii la faro, GPT‑5 wuxuu soo saari doonaa farriimo horudhac ah oo isticmaaluhu arki karo ka hor iyo inta u dhexeysa wicitaannada qalabka. Si ka duwan farriimaha qarsoon ee caqliyeynta, farriimahan muuqda waxay u oggolaanayaan GPT‑5 inuu isticmaalaha la wadaago qorshayaal iyo horumar, taasoo ka caawinaysa isticmaalayaasha dhammaadka inay fahmaan habkiisa iyo ujeeddada ka dambeysa wicitaannada qalabka.
Waxaan soo bandhigaynaa nooc qalab cusub—custom tools—kaas oo u oggolaanaya GPT‑5 inuu qalab ugu yeedho qoraal caadi ah halkii JSON. Si GPT‑5 loogu xaddido inuu raaco qaababka custom tool, horumariyeyaashu waxay bixin karaan regex, ama xitaa context-free grammar(ku furmaa daaqad cusub) si buuxda loo qeexay.
Hore, is-dhexgalkeenna qalabka uu horumariyuhu qeexo wuxuu u baahnaa in loogu yeedho JSON, oo ah qaab guud oo ay adeegsadaan web APIs iyo horumariyeyaashu. Hase yeeshee, soo saarista JSON sax ah waxay u baahan tahay nooca inuu si qumman uga baxo dhammaan calaamadaha xigashada, backslashes, newlines, iyo calaamadaha kale ee xakamaynta. In kasta oo noocyadeennu si fiican loogu tababaray soo saarista JSON, gelinno dhaadheer sida boqolaal sadar oo koodh ah ama warbixin 5 bog ah, suurtagalnimada khalad way sii kacdaa. Iyadoo la adeegsanayo custom tools, GPT‑5 wuxuu u qori karaa gelinnada qalabka qoraal caadi ah, isaga oo aan u baahnayn inuu ka baxo dhammaan calaamadaha u baahan escape-ka.
SWE-bench Verified marka la isticmaalayo custom tools halkii JSON tools, GPT‑5 wuxuu helaa ku dhowaad isla dhibcaha.
GPT‑5 wuxuu hormarinayaa xadka ugu casriyeysan ee amniga waana nooc ka adkaysi badan, la isku halayn karo, oo waxtar leh. GPT‑5 si weyn ayuu uga yar yahay inuu mala-awaal sameeyo marka loo eego noocyadeennii hore, si daacadnimo leh ayuu isticmaalaha ugu gudbiyaa ficilladiisa iyo awoodihiisa, wuxuuna bixiyaa jawaabta ugu faa’iidada badan marka ay suurtagal tahay isaga oo weli ku jira xuduudaha amniga. Wax dheeraad ah waxaad ka akhrisan kartaa bloggeenna cilmi-baarista.
GPT‑5 hadda waxaa laga heli karaa madasha API-ga saddex cabbir: gpt-5, gpt-5-mini, iyo gpt-5-nano. Waxaa laga heli karaa Responses API, API-ga dhammeystirka wada-sheekeysiga, waana nooca caadiga ah ee Codex CLI. Qiimaha GPT‑5 waa $1.25/1M input token iyo $10/1M output token, GPT‑5 mini qiimihiisu waa $0.25/1M input token iyo $2/1M output token, halka GPT‑5 nano uu yahay $0.05/1M input token iyo $0.40/1M output token.
Noocyadani waxay taageeraan doorsoomayaasha API-ga ee reasoning_effort iyo verbosity, iyo sidoo kale custom tools. Waxa kale oo ay taageeraan wicitaanka qalabka ee barbar socda, qalab ku-dhisan (raadinta webka, raadinta faylka, soo saarida sawirka, iyo kuwo kale), astaamaha aasaasiga ah ee API-ga (streaming, soo saaridyo habeysan, iyo kuwo kale), iyo astaamo kharash-badbaadin ah sida prompt caching iyo Batch API.
Nooca aan-caqliyeynta ahayn ee GPT‑5 ee lagu isticmaalo ChatGPT waxaa gudaha API-ga looga heli karaa sida gpt-5-chat-latest, sidoo kale qiimihiisu waa $1.25/1M input token iyo $10/1M output token.
GPT‑5 sidoo kale waxaa laga daahfurayaa dhammaan madallada Microsoft, oo ay ku jiraan Microsoft 365 Copilot, Copilot, GitHub Copilot, iyo Azure AI Foundry.
Si aad u bilowdo, eeg dukumentiyada(ku furmaa daaqad cusub), faahfaahinta qiimaha(ku furmaa daaqad cusub), iyo hagaha weydiinta(ku furmaa daaqad cusub) ee GPT‑5.
Sirdoon
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| AIME ’25(no tools) | 94.6% | 91.1% | 85.2% | 88.9% | 92.7% | 46.4% | 40.2% | - |
| FrontierMath(with python tool only) | 26.3% | 22.1% | 9.6% | 15.8% | 15.4% | - | - | - |
| GPQA diamond(no tools) | 85.7% | 82.3% | 71.2% | 83.3% | 81.4% | 66.3% | 65.0% | 50.3% |
| HLE[1](no tools) | 24.8% | 16.7% | 8.7% | 20.2% | 14.7% | 5.4% | 3.7% | - |
| HMMT 2025(no tools) | 93.3% | 87.8% | 75.6% | 81.7% | 85.0% | 28.9% | 35.0% | - |
[1] Waxaa jira farqi yar oo ku jira tirooyinka lagu sheegay bloggeennii hore, maadaama kuwii hore lagu socodsiiyay nooc duug ah oo HLE ah.
Multimodal
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| MMMU | 84.2% | 81.6% | 75.6% | 82.9% | 81.6% | 74.8% | 72.7% | 55.4% |
| MMMU-Pro(avg across standard and vision sets) | 78.4% | 74.1% | 62.6% | 76.4% | 73.4% | 60.3% | 58.9% | 33.0% |
| CharXiv reasoning(python enabled) | 81.1% | 75.5% | 62.7% | 78.6% | 72.0% | 56.7% | 56.8% | 40.5% |
| VideoMMMU, max frame 256 | 84.6% | 82.5% | 66.8% | 83.3% | 79.4% | 60.9% | 55.1% | 30.2% |
| ERQA | 65.7% | 62.9% | 50.1% | 64.0% | 56.5% | 44.3% | 42.3% | 26.5% |
Koodhin
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| SWE-Lancer: IC SWE Diamond Freelance Coding Tasks | US$112K | US$75K | US$49K | US$86K | US$66K | US$34K | US$31K | US$9K |
| SWE-bench Verified[2] | 74.9% | 71.0% | 54.7% | 69.1% | 68.1% | 54.6% | 23.6% | - |
| Aider polyglot(diff) | 88.0% | 71.6% | 48.4% | 79.6% | 58.2% | 52.9% | 31.6% | 6.2% |
[2] Waxaan ka reebnay 23/500 dhibaato oo aan ka shaqayn karin kaabayaashayada. Liiska buuxa ee 23-ka hawlood ee la reebay waa 'astropy__astropy-7606', 'astropy__astropy-8707', 'astropy__astropy-8872', 'django__django-10097', 'django__django-7530', 'matplotlib__matplotlib-20488', 'matplotlib__matplotlib-20676', 'matplotlib__matplotlib-20826', 'matplotlib__matplotlib-23299', 'matplotlib__matplotlib-24970', 'matplotlib__matplotlib-25479', 'matplotlib__matplotlib-26342', 'psf__requests-6028', 'pylint-dev__pylint-6528', 'pylint-dev__pylint-7080', 'pylint-dev__pylint-7277', 'pytest-dev__pytest-5262', 'pytest-dev__pytest-7521', 'scikit-learn__scikit-learn-12973', 'sphinx-doc__sphinx-10466', 'sphinx-doc__sphinx-7462', 'sphinx-doc__sphinx-8265', iyo 'sphinx-doc__sphinx-9367'.
Raacitaanka Tilmaamaha
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| Scale multichallenge[3](o3-mini grader) | 69.6% | 62.3% | 54.9% | 60.4% | 57.5% | 46.2% | 42.2% | 31.1% |
| Internal API instruction following eval(hard) | 64.0% | 65.8% | 56.1% | 47.4% | 44.7% | 49.1% | 45.1% | 31.6% |
| COLLIE | 99.0% | 98.5% | 96.9% | 98.4% | 96.1% | 65.8% | 54.6% | 42.5% |
[3] Ogow: waxaan ogaanay in qiimeeyaha caadiga ah ee MultiChallenge (GPT-4o) uu inta badan si khaldan u dhibceeyo jawaabaha nooca. Waxaan ogaanay in marka qiimeeyaha loo beddelo nooca caqliyeynta, sida o3-mini, ay si weyn u hagaajiso saxnaanta qiimeynta ee muunadaha aan baaray.
xusida function-ka
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| Tau2-bench airline | 62.6% | 60.0% | 41.0% | 64.8% | 60.2% | 56.0% | 51.0% | 14.0% |
| Tau2-bench retail | 81.1% | 78.3% | 62.3% | 80.2% | 70.5% | 74.0% | 66.0% | 21.5% |
| Tau2-bench telecom | 96.7% | 74.1% | 35.5% | 58.2% | 40.5% | 34.0% | 44.0% | 12.1% |
Macnaha Dheer
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| OpenAI-MRCR: 2 needle 128k | 95.2% | 84.3% | 43.2% | 55.0% | 56.4% | 57.2% | 47.2% | 36.6% |
| OpenAI-MRCR: 2 needle 256k | 86.8% | 58.8% | 34.9% | - | - | 56.2% | 45.5% | 22.6% |
| Graphwalks bfs <128k | 78.3% | 73.4% | 64.0% | 77.3% | 62.3% | 61.7% | 61.7% | 25.0% |
| Graphwalks parents <128k | 73.3% | 64.3% | 43.8% | 72.9% | 51.1% | 58.0% | 60.5% | 9.4% |
| BrowseComp Long Context 128k | 90.0% | 89.4% | 80.4% | 88.3% | 80.0% | 85.9% | 89.0% | 89.4% |
| BrowseComp Long Context 256k | 88.8% | 86.0% | 68.4% | - | - | 75.5% | 81.6% | 19.1% |
| VideoMME(long, with subtitle category) | 86.7% | 78.5% | 65.7% | 84.9% | 79.5% | 78.7% | 68.4% | 55.2% |
Mala-awaallo khaldan
| GPT-5(high) | GPT-5 mini(high) | GPT-5 nano(high) | OpenAI o3(high) | OpenAI o4-mini(high) | GPT-4.1 | GPT-4.1 mini | GPT-4.1 nano | |
|---|---|---|---|---|---|---|---|---|
| LongFact-Concepts hallucination rate(no tools)[lower is better] | 1.0% | 0.7% | 1.0% | 5.2% | 3.0% | 0.7% | 1.1% | - |
| LongFact-Objects hallucination rate(no tools)[lower is better] | 1.2% | 1.3% | 2.8% | 6.8% | 8.9% | 1.1% | 1.8% | - |
| FActScore hallucination rate(no tools)[lower is better] | 2.8% | 3.5% | 7.3% | 23.5% | 38.7% | 6.7% | 10.9% | - |


