
The new model, text-embedding-ada-002, replaces five separate models for text search, text similarity, and code search. It outperforms our previous most capable model, Davinci, at most tasks, while being priced 99.8% lower.
Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts. Since the initial launch of the OpenAI /embeddings endpoint, many applications have incorporated embeddings to personalize, recommend, and search content.
You can query the /embeddings endpoint for the new model with two lines of code using our OpenAI Python Library, just as you could with previous models:
```python
import openai

# openai.Embedding.create is the interface of openai-python releases before 1.0.
response = openai.Embedding.create(
    input="porcine pals say",
    model="text-embedding-ada-002",
)
```
Stronger performance. text-embedding-ada-002 outperforms all the old embedding models on text search, code search, and sentence similarity tasks, and gets comparable performance on text classification. For each task category, we evaluate the models on the datasets used in the old embeddings.
| Model | Performance |
| --- | --- |
| text-embedding-ada-002 | 53.3 |
| text-search-davinci-*-001 | 52.8 |
| text-search-curie-*-001 | 50.9 |
| text-search-babbage-*-001 | 50.4 |
| text-search-ada-*-001 | 49.0 |
Unification of capabilities. We have significantly simplified the interface of the /embeddings endpoint by merging the five separate models shown above (text-similarity, text-search-query, text-search-doc, code-search-text, and code-search-code) into a single new model. This single representation performs better than our previous embedding models across a diverse set of text search, sentence similarity, and code search benchmarks.
Longer context. The context length of the new model is increased by a factor of four, from 2048 to 8192, making it more convenient to work with long documents.
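Documents longer than the context length still need to be split before they are embedded. The following is a minimal sketch of one way to do that, assuming the document has already been tokenized elsewhere; the `chunk_tokens` helper and the placeholder token ids are hypothetical and are not part of the OpenAI library.

```python
from typing import List

MAX_CONTEXT_TOKENS = 8192  # context length stated for text-embedding-ada-002


def chunk_tokens(tokens: List[int], max_len: int = MAX_CONTEXT_TOKENS) -> List[List[int]]:
    """Split a token sequence into consecutive windows of at most max_len tokens."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


# Hypothetical document of 20,000 token ids (placeholder values, not real tokens).
document_tokens = list(range(20_000))
chunks = chunk_tokens(document_tokens)
print([len(chunk) for chunk in chunks])  # [8192, 8192, 3616]
```

Each chunk could then be embedded separately. Whether to average the chunk embeddings or index them individually is a design choice that depends on the application.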
Smaller embedding size. The new embeddings have only 1536 dimensions, one-eighth the size of davinci-001 embeddings, making the new embeddings more cost effective in working with vector databases.
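To make the storage difference concrete, here is a rough back-of-the-envelope sketch. It assumes vectors are stored as 32-bit floats with no index overhead, and the corpus size of one million vectors is hypothetical; the davinci-001 dimension is derived from the one-eighth ratio stated above.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float


def index_size_bytes(num_vectors: int, dimensions: int,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw storage needed for the vectors alone, ignoring index overhead."""
    return num_vectors * dimensions * bytes_per_value


NEW_DIMS = 1536
OLD_DAVINCI_DIMS = NEW_DIMS * 8  # implied by "one-eighth the size"
num_vectors = 1_000_000          # hypothetical corpus size

new_size = index_size_bytes(num_vectors, NEW_DIMS)
old_size = index_size_bytes(num_vectors, OLD_DAVINCI_DIMS)
print(f"new: {new_size / 1e9:.2f} GB, old: {old_size / 1e9:.2f} GB")
# new: 6.14 GB, old: 49.15 GB
```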
Reduced price. We have reduced the price of new embedding models by 90% compared to old models of the same size. The new model achieves better or similar performance as the old Davinci models at a 99.8% lower price.
Overall, the new embedding model is a much more powerful tool for natural language processing and code tasks. We are excited to see how our customers will use it to create even more capable applications in their respective fields.
The new text-embedding-ada-002 model does not outperform text-similarity-davinci-001 on the SentEval linear probing classification benchmark. For tasks that require training a lightweight linear layer on top of embedding vectors for classification prediction, we suggest comparing the new model to text-similarity-davinci-001 and choosing whichever model gives optimal performance.
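For readers unfamiliar with linear probing, the sketch below shows the general idea: the embedding vectors stay fixed and only a single linear layer (here, logistic regression trained with plain gradient descent) is fit on top of them. The three-dimensional vectors are made-up stand-ins for real embeddings, and the function names are hypothetical; to compare two embedding models you would run the same procedure on each model's vectors and compare held-out accuracy.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_linear_probe(vectors: Sequence[Sequence[float]], labels: Sequence[int],
                       epochs: int = 500, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression weights on frozen vectors with batch gradient descent."""
    dim = len(vectors[0])
    weights, bias, n = [0.0] * dim, 0.0, len(vectors)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(vectors, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Return 1 if the predicted probability is at least 0.5, otherwise 0."""
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)


# Toy stand-ins for embedding vectors (real embeddings would have 1536 dimensions).
train_vectors = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
train_labels = [1, 1, 0, 0]

weights, bias = train_linear_probe(train_vectors, train_labels)
print(predict(weights, bias, [0.7, 0.3, 0.0]), predict(weights, bias, [0.3, 0.7, 0.0]))
```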
Check the Limitations & Risks section in the embeddings documentation for general limitations of our embedding models.
Kalendar AI is a sales outreach product that uses embeddings to match the right sales pitch to the right customers out of a dataset containing 340M profiles. This automation relies on similarity between embeddings of customer profiles and sales pitches to rank the most suitable matches, eliminating 40–56% of unwanted targeting compared to their old approach.
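Similarity-based matching of this kind can be sketched in a few lines. The example below is an illustration only, not Kalendar AI's implementation: it assumes cosine similarity as the similarity measure, and the three-dimensional vectors, profile names, and `rank_profiles` helper are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_profiles(pitch: Sequence[float], profiles: Dict[str, Sequence[float]],
                  top_k: int = 2) -> List[Tuple[str, float]]:
    """Score every profile against the pitch and return the top_k best matches."""
    scored = [(name, cosine_similarity(pitch, vec)) for name, vec in profiles.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up vectors standing in for embeddings of a sales pitch and customer profiles.
pitch_embedding = [0.9, 0.1, 0.2]
profile_embeddings = {
    "profile_a": [0.8, 0.2, 0.1],
    "profile_b": [0.1, 0.9, 0.3],
    "profile_c": [0.7, 0.0, 0.4],
}

best_matches = rank_profiles(pitch_embedding, profile_embeddings)
print([name for name, _ in best_matches])  # ['profile_a', 'profile_c']
```

At the scale of hundreds of millions of profiles, a brute-force scan like this would normally be replaced by an approximate nearest-neighbor index in a vector database.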
Notion, the online workspace company, will use OpenAI's new embeddings to improve Notion search beyond today's keyword matching systems.


