OpenAI

Introducing LifeSciBench

An expert-written, expert-reviewed benchmark grounded in real-world life sciences research

Agentic AI systems are becoming increasingly capable of carrying out scientific tasks. But their usefulness in the life sciences depends on how well they handle real, difficult research. That work is rarely just a factual question or a simple prediction. Researchers interpret incomplete evidence, reconcile conflicting results, design complex experiments, troubleshoot assays, assess translational risks, and decide on the next step under uncertainty.

Existing benchmarks do not fully measure these capabilities. Many life sciences evaluations focus on a narrow area or an isolated skill, which yields well-structured questions and clean reference answers. They are valuable, but they often do not show whether a model can operate across the full breadth of research work.

We built LifeSciBench to narrow this gap. Every task is grounded in the judgment of life scientists with Ph.D. training and direct drug discovery experience in biotech and pharma.

LifeSciBench has 750 expert-written tasks spanning 7 workflows and 7 biological domains.

1,062

Task artifacts

173

Contributing scientists

19,020

Rubric criteria

453

Expert reviewers

What LifeSciBench measures

LifeSciBench measures whether AI systems can support real life sciences work, not just answer biology questions. To build the benchmark's taxonomy, we asked scientists which workflows they use most often in applied research. We grouped their responses into 7 categories: evidence management, analysis, design and optimization, scientific reasoning, validation and operations, translation, and scientific communication.

Each task resembles a request a scientist might give a knowledgeable collaborator: a scientific prompt, relevant context or artifacts, and an open-ended response. Expert-written rubrics assess whether a model can deliver a response with the accuracy, level of detail, reasoning, caveats, and format a scientist would expect.

Building the dataset

LifeSciBench evaluates scientific reasoning and the practical skills that matter for real scientific use. Tasks require models to solve genuine research problems: interpreting evidence, exercising domain-grounded judgment, and reaching conclusions that are useful to experts. Many tasks also require handling uncertainty and reasoning over data files, not just the prompt text.

The benchmark was built to reflect the complexity of life sciences work. Overall, 79% of tasks require multiple reasoning or decision steps, with an average of 4 steps per task. LifeSciBench includes 1,062 attached artifacts: figures, PDFs, tables, sequence files, structure or chemistry files, and website references. More than half of the tasks (53%) require models to interpret or integrate data from at least one artifact.

Tasks were created by 173 expert scientists from across the life sciences. Each had Ph.D. training and biotechnology or pharma experience. Tasks could go through an unlimited number of revisions before acceptance; accepted tasks went through an average of 6 automated reviews and at least 2 expert reviews. Review relied on a verifiable answer or strong expert consensus, with at least 90% of domain reviewers in agreement. This process ensured that accepted tasks are scientifically grounded, gradable, and reflective of applied research.

Diagram showing LifeSciBench tasks that combine life sciences data sources such as gene sequences, molecular structures, figures, documents, spreadsheets, and website links, along with multi-step reasoning and expert review.

Grading and rubrics

LifeSciBench tasks are graded with detailed, task-specific rubrics that break a response into scientific claims, calculations, decisions, reasoning, and more. In total, the expert rubrics contain 19,020 criteria (an average of 25 per task) to measure scientific accuracy and usefulness for research decisions.

This design reflects how scientific work is evaluated: many tasks cannot be judged by the final answer alone. A response can reach a correct overall conclusion yet still fall short if it misses a key assay limitation or a consequential biological detail. Conversely, a partial response can contain high-quality reasoning even if it does not complete the task.

Detailed rubrics capture this. LifeSciBench does not only check whether the answer is correct; it measures whether a model gets there in a way that is scientifically sound and practically useful.

Extracting, reconciling, and verifying scientific evidence from papers, figures, tables, and experimental records.

Example evaluation

We’re preparing for a Type B FDA meeting on AAV9-microDys-X, an AAV9-based micro-dystrophin gene therapy for Duchenne muscular dystrophy that expresses a 138 kDa construct from an MCK promoter, and we want a hard-nosed critique of whether our current package really supports accelerated approval on micro-dystrophin expression as a surrogate endpoint reasonably likely to predict clinical benefit.

Study context: open-label Phase 1b/2 in 12 ambulatory boys age 4–7 with confirmed DMD and out-of-frame rod-domain deletions. The package is:

  • Pre-treatment vastus lateralis biopsies: 0–3% of healthy-control dystrophin by quantitative Western blot using MANEX1A against the N-terminal actin-binding domain.
  • 12-week post-treatment contralateral vastus lateralis biopsies: mean micro-dystrophin 38% of healthy control (range 18–61%) by the same Western blot, normalized to total protein by Coomassie staining.
  • Post-treatment immunofluorescence: sarcolemmal signal in 75–95% of fibers using a polyclonal anti-dystrophin C-terminal antibody.
  • 48-week function: mean NSAA change +1.4 points from baseline versus −0.6 in an external published natural-history registry cohort (p = 0.03 by unpaired t-test).
  • Safety: transient transaminitis in 8/12 patients managed with steroid taper; one resolved myocarditis; no deaths.
  • Biodistribution/persistence: AAV9 vector genomes detectable in muscle at 12 weeks at a mean of 2.3 vector genomes per nucleus.
  • Eligibility: no baseline anti-AAV9 neutralizing antibodies (titer <1:400) and no exon-44 deletions.

Please pressure-test this package item by item: where would FDA or a skeptical reviewer say the evidence, as presented, fails to support our conclusion, and what additional data, analyses, or design changes would be needed to close those gaps if possible? For anything not specified, assume what is known from other clinical-grade 138 kDa micro-dystrophin programs and relevant clinical trials through December 31, 2025 also applies.

Candidate response

Conclusion: As presented, this package is not sufficient to support accelerated approval based on micro-dystrophin expression as a surrogate endpoint reasonably likely to predict clinical benefit. A skeptical FDA reviewer would likely find the evidence on the biomarker, function, durability, safety, and generalizability to be weakly supported.

Package item

Key failure mode

What would be needed

Western blot quantification

MANEX1A binds an N-terminal epitope shared by native full-length dystrophin and transgene-derived micro-dystrophin, so the assay cannot cleanly distinguish transgene-derived micro-dystrophin from residual or revertant dystrophin. It is also inappropriate to quantify the 138 kDa micro-dystrophin against full-length dystrophin from healthy controls as the reference standard.

Use a recombinant micro-dystrophin standard, and add an orthogonal measurement that clearly distinguishes the transgene product from native dystrophin, such as targeted mass spectrometry or a transgene- or epitope-specific assay.

Immunofluorescence

A polyclonal antibody against the C-terminus is poorly suited, because the 138 kDa construct lacks the C-terminal domain. Many DMD patients have revertant fibers, and revertant dystrophin can retain C-terminal epitopes. Revertant fibers can also expand with age, which biases the IF signal, especially in older boys.

Repeat IF using an antibody against an epitope that is present in the transgene product but absent from revertant dystrophin. Quantify transgene-expressing fibers and revertant fibers separately.

Surrogate endpoint validation

The package conflates protein quantity with clinical function. "38% of healthy-control protein level" does not mean 38% of normal dystrophin function, because micro-dystrophin is structurally truncated.

Explicitly establish the relationship between percent micro-dystrophin, sarcolemmal localization, downstream functional restoration, and clinical benefit before treating expression as a surrogate endpoint.

Biopsy design

Pre- and post-treatment vastus lateralis biopsies taken from opposite sides of the body introduce left-right variability as well as sampling variability within the same muscle. Disease progression and fibro-fatty replacement of muscle can also affect expression levels normalized to total protein.

Standardize the biopsy site using consistent anatomical landmarks, normalize to muscle-specific proteins, and measure fat and connective tissue content in parallel.

NSAA comparator/statistics

An external natural-history cohort is not a concurrent randomized control. Trial eligibility, supportive care, participation effects, baseline NSAA, steroid regimen, age, and exon class all bias the comparison. An unpaired t-test is inadequate. In addition, a +1.4 NSAA change is within test-retest variability for this age group.

Run a randomized, concurrent, placebo-controlled study, or at minimum use an adjusted analysis that accounts for baseline NSAA, age, steroid regimen, exon class, and other confounders.

Age-related confounding

Boys aged 4-7 are in a developmental window in which untreated DMD patients can gain motor function before declining. The 48-week NSAA change conflates developmental gains, disease progression, and any possible treatment effect.

Use a concurrent randomized control with age stratification to separate the developmental trajectory from the treatment effect.

Prior clinical experience

Open-label functional signals for micro-dystrophin have not reliably predicted confirmatory benefit; published examples include confirmatory micro-dystrophin gene therapy trials that failed to reproduce the NSAA improvement seen in open-label studies.

Do not rely on open-label NSAA change as decisive support. Require controlled functional evidence.

Structural limitation of the construct

The 138 kDa construct deletes spectrin-like repeats R16/17, which contain the nNOS-binding site. Loss of nNOS recruitment impairs functional sympatholysis and protection from ischemia during exercise, creating a mechanistic ceiling on rescue regardless of expression level.

Include mechanistic studies showing that this specific construct restores dystrophin-associated complex function, nNOS localization, exercise response, and muscle protection.

AAV durability

Vector genomes at 12 weeks do not establish durable expression. AAV9 genomes are largely non-integrating and can decline over time. Vector genome persistence is not the same as sustained protein expression.

Measure long-term transgene protein expression and the durability of the functional signal beyond 12 weeks.

Immune/safety profile

Transaminitis in 8/12 patients is consistent with an immune response against AAV-transduced cells, but the mechanism has not been confirmed. The single myocarditis case is a concern given the cardiac tropism of AAV9.

Provide in-depth immune monitoring, hepatic and cardiac safety characterization, and enhanced cardiac follow-up.

Patient selection/generalizability

Excluding patients with pre-existing anti-AAV9 neutralizing antibodies limits generalizability. Excluding exon-44 deletions limits applicability across DMD subgroups. n=12 is too small to characterize safety and efficacy for the broader DMD population.

Broaden eligibility where feasible, or prespecify subgroup analyses by antibody status, genotype or exon class, age, and baseline function before using the results to support a broad approval.

Regulatory conclusion: The package may demonstrate biological activity, but it has not yet established that the measured micro-dystrophin expression is a reliable surrogate reasonably likely to predict clinical benefit. The main gaps are assay specificity, invalid quantification standards, possible confounding from revertant fibers, the lack of randomized controls, age-related NSAA confounding, unproven durability, and unresolved safety and generalizability issues.

To close these gaps, the program would need a controlled, age-stratified clinical design with transgene-specific expression assays, orthogonal protein quantification, tissue-composition controls, durability data, mechanistic functional assays of the truncated construct, and robust safety monitoring, particularly hepatic and cardiac.

Rubric criteria & points

Criterion
Points
Identifies assay/measurement problems in micro-dystrophin quantification, including MANEX1A epitope sharing, invalid full-length dystrophin standards, and need for recombinant or orthogonal transgene-specific measurement.
+24
Explains why micro-dystrophin expression level is not automatically a valid surrogate for functional clinical benefit.
+22
Flags biopsy-site, tissue-composition, and age-window confounding that weaken expression and NSAA interpretation.
+19
Critiques the NSAA comparator/statistics, especially reliance on external natural-history controls.
+12
Addresses AAV durability, immune response, transaminitis, myocarditis, and need for longer-term expression/safety follow-up.
+15
Notes patient-selection/generalizability gaps, including anti-AAV9 exclusion, exon-44 exclusion, and small sample size.
+8
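
To make the weighting concrete, here is a minimal sketch of how a weighted rubric like the one above could be collapsed into a single reward. It assumes the reward is simply the share of available points earned; the actual LifeSciBench grader is not described in this post. The names `Criterion` and `rubric_reward`, the shortened labels, and the set of criteria marked as met are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    label: str   # shortened, illustrative label for a rubric line
    points: int  # weight taken from the example rubric above


# Point values copied from the example rubric; labels are abbreviated.
EXAMPLE_RUBRIC = [
    Criterion("assay/measurement problems", 24),
    Criterion("surrogate validity", 22),
    Criterion("biopsy, tissue, and age confounding", 19),
    Criterion("NSAA comparator/statistics", 12),
    Criterion("durability and safety follow-up", 15),
    Criterion("patient selection/generalizability", 8),
]


def rubric_reward(rubric, met_labels):
    """Return the fraction of rubric points earned (assumed scoring rule)."""
    total = sum(c.points for c in rubric)
    if total <= 0:
        raise ValueError("rubric must carry positive total points")
    earned = sum(c.points for c in rubric if c.label in met_labels)
    return earned / total


# Hypothetical grader judgments: three of the six criteria are met.
met = {
    "assay/measurement problems",
    "surrogate validity",
    "NSAA comparator/statistics",
}
reward = rubric_reward(EXAMPLE_RUBRIC, met)
print(f"rubric reward: {reward:.2f}")
```

Under these illustrative judgments the response earns 58 of the 100 available points, which shows how a response can collect substantial partial credit while still missing several criteria.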

Validating LifeSciBench

We validated LifeSciBench through independent expert review. Feedback came from 453 reviewers who did not author the tasks. 97% of reviewers held a Ph.D. or an equivalent doctorate; on average they had 12 years of experience and 14 peer-reviewed publications; 88% had received a scientific award or fellowship.

Reviewers assessed whether each task had the properties of a rigorous benchmark question: real-world research relevance, an appropriate test of scientific reasoning and domain expertise, scientific grounding, and usefulness for measuring model performance. Agreement exceeded 96% in every category.

Real-world relevance

Does this task reflect real life sciences work?

Strongly agree
90.4%
Agree overall
98.3%

Scientific reasoning / domain expertise

Does this task test and assess scientific reasoning and the right life sciences skills?

Strongly agree
86.4%
Agree overall
98.1%

Scientific grounding

Is this task scientifically grounded and answerable, and does it rest on appropriate evidence, data, artifacts, or expert consensus?

Strongly agree
77.1%
Agree overall
96.5%

Overall utility

Overall, is this a strong life sciences evaluation task?

Strongly agree
79.1%
Agree overall
96.6%

Reviewer comments supported the quantitative ratings:

Overall this is a strong task, because it has a correct core interpretation while still leaving room to distinguish good answers by how carefully they bound the uncertainty.

Results

We report two complementary metrics. Pass rate is the percentage of tasks on which a model reaches the task success threshold, 70%. Score is the mean rubric reward, which gives partial credit for individual criteria even when the task is not fully solved. Both metrics matter, because a scientific response can be partially correct or useful without meeting every criterion.
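
The two metrics can be sketched as follows. This is a minimal illustration, assuming each task has a single rubric reward between 0 and 1 and that a task counts as passed when that reward reaches the 70% threshold. The function names and the sample rewards are hypothetical; this is not the benchmark's actual scoring code.

```python
PASS_THRESHOLD = 0.70  # task success threshold stated in the text


def pass_rate(rewards, threshold=PASS_THRESHOLD):
    """Fraction of tasks whose rubric reward reaches the threshold."""
    if not rewards:
        raise ValueError("need at least one task reward")
    return sum(r >= threshold for r in rewards) / len(rewards)


def mean_score(rewards):
    """Mean rubric reward, which keeps partial credit from unsolved tasks."""
    if not rewards:
        raise ValueError("need at least one task reward")
    return sum(rewards) / len(rewards)


# Hypothetical per-task rubric rewards for one model.
sample_rewards = [0.90, 0.65, 0.40, 0.75]
print(f"pass rate: {pass_rate(sample_rewards):.1%}")
print(f"score:     {mean_score(sample_rewards):.1%}")
```

In this made-up sample, half of the tasks pass while the mean reward is higher than the pass rate, because the two unsolved tasks still contribute partial credit to the score.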

Model performance varies by task type, workflow, and response format.

Where AI systems are already stronger

LifeSciBench shows that frontier models are strongest at scientific synthesis, communication, and structured interpretation. Pass rates are still low, so these domains are not saturated; even so, GPT‑Rosalind outperforms GPT‑5.5, raising the strict pass rate from 25.7% to 36.1%.

Gains in model capability are most visible in Scientific Communication and Translation. For example, the Scientific Communication pass rate rises from 56.3% for GPT‑5.5 to 71.1% for GPT‑Rosalind; this category is small (n=9), but it suggests that frontier models are quickly getting better at organizing evidence and producing explanations that satisfy experts. Translation (the "bench-to-bedside" process of drug development) also rises, from 36.8% for GPT‑5.5 to 57.7% for GPT‑Rosalind, suggesting that models are getting better at connecting preclinical evidence to its clinical implications.

Rubric-level results point the same way. On tasks that require expert-useful or actionable output, GPT‑Rosalind scores 44.7% versus 29.1% for GPT‑5.5. On tasks that require handling uncertainty and caveats, it scores 44.8% versus 29.3%. This pattern suggests that models are most useful when a task has clear evidence boundaries and calls for structured scientific judgment.

GPT‑Rosalind leads performance on scientifically valuable tasks identified by industry and academic experts.


Where AI systems still fall short

Performance is much weaker on scientific tasks that are artifact-heavy, design-heavy, or operationally constrained. In particular, Design, Optimization & Prediction is a difficult workflow: GPT‑Rosalind's pass rate there is 30.7%, and for Analysis it is 30.3%.

Artifact use is a very clear gap. Although GPT‑Rosalind outperforms GPT‑5.5 on artifact-heavy tasks, its pass rate drops from 45.1% on text-only tasks to 28.1% on tasks with artifacts or URLs. GPT‑5.5 also drops, from 29.9% to 21.9%. Further analysis shows that frontier models struggle to extract data from complex figures or large sequence files and to carry it into the final answer.
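
The size of these drops follows directly from the four pass rates quoted above. The sketch below is only arithmetic on those figures; the helper name `drop` and the dictionary layout are hypothetical.

```python
# Pass rates in percent, as quoted in the text: (text-only, with artifacts or URLs).
PASS_RATES = {
    "GPT-Rosalind": (45.1, 28.1),
    "GPT-5.5": (29.9, 21.9),
}


def drop(text_only: float, with_artifacts: float) -> tuple[float, float]:
    """Return (percentage-point drop, relative drop) between two pass rates."""
    points = text_only - with_artifacts
    return points, points / text_only


for model, (text_only, with_artifacts) in PASS_RATES.items():
    points, relative = drop(text_only, with_artifacts)
    print(f"{model}: -{points:.1f} points ({relative:.0%} relative)")
```

By this arithmetic GPT‑Rosalind loses about 17 percentage points (roughly 38% of its text-only pass rate) and GPT‑5.5 about 8 points (roughly 27%), so the stronger model keeps its lead in absolute terms but gives up a larger share of its text-only performance.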

Pass rates fall when tasks require data-grounded reasoning or working with artifacts

Response format matters too. Tasks that require an exact sequence, structure, or construct output have low pass rates: GPT‑Rosalind reaches 14.8% on numeric tasks and 24.0% on sequence or structure output. Structure-generation tasks are fragile: GPT‑Rosalind is at 27.3%, only slightly ahead of GPT‑5.5. Some of the gap may come from strict grading of exact-answer tasks, where a small arithmetic or formatting error can sink the response. Still, these failures matter, because many workflows require directly usable output, such as designing CRISPR/HDR donors or siRNAs.

Models often get close but do not finish the task. On roughly 14% of tasks, models earned substantial rubric credit without reaching the strict pass threshold. For GPT‑Rosalind, 109 tasks had a pass rate below 20% but still earned at least 50% of the rubric reward. In practice, models can find relevant evidence or a partial answer, but they fail if they miss a key constraint, use the wrong evidence, make an incomplete calculation, or do not connect their reasoning to a useful scientific decision.
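
A near-miss filter of this kind could be sketched as below. It assumes each task is attempted several times, that a task's pass rate is the fraction of attempts reaching the 70% threshold, and that its rubric reward is the mean over attempts. None of these details are spelled out in the text, and the function name, task IDs, and rewards are hypothetical.

```python
def near_miss_tasks(rewards_by_task, threshold=0.70,
                    max_pass_rate=0.20, min_reward=0.50):
    """Return IDs of tasks that rarely pass but still earn substantial credit."""
    selected = []
    for task_id, rewards in rewards_by_task.items():
        if not rewards:
            continue  # skip tasks with no graded attempts
        task_pass_rate = sum(r >= threshold for r in rewards) / len(rewards)
        task_reward = sum(rewards) / len(rewards)
        if task_pass_rate < max_pass_rate and task_reward >= min_reward:
            selected.append(task_id)
    return sorted(selected)


# Hypothetical rubric rewards for repeated attempts at four tasks.
attempts = {
    "task-a": [0.60, 0.65, 0.55],  # never passes, but consistent partial credit
    "task-b": [0.90, 0.80],        # passes every time
    "task-c": [0.10, 0.20],        # fails with little credit
    "task-d": [0.75, 0.60, 0.60, 0.60, 0.60],  # passes 1 in 5, not below 20%
}
print(near_miss_tasks(attempts))
```

Only the first task is flagged here: it never reaches the threshold yet averages well over half of the rubric reward, which is the "close but not finished" pattern described above.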

Limitations & what's next

LifeSciBench is a step toward measuring the usefulness of AI systems for life sciences research, but it does not replace studying models inside live research. The benchmark focuses on self-contained tasks that reflect recurring industry workflows, leaving out many other specialties and tasks. Real research is iterative: scientists gather new evidence, refine hypotheses, design follow-up experiments, and adapt their plans to the results.

Strong LifeSciBench performance is therefore evidence of real-task capability, not a direct measure of downstream research impact. The benchmark is grounded in industry workflows, but it does not capture the full diversity and dynamics of live research programs.

The next step is to link benchmark performance to deployment studies in live research workflows. Although LifeSciBench was built with working scientists, measuring whether AI systems accelerate discovery or improve R&D outcomes requires studying model use and performance inside real research, over long periods, and across cycles of reasoning, feedback, and experimental follow-up.

Get involved

Help shape the next generation of AI benchmarks for the life sciences, or request access to GPT‑Rosalind.

Author

OpenAI