Unlocking the Codex harness: how we built the App Server
By Celia Chen, Member of the Technical Staff
OpenAI’s coding agent Codex exists across many different surfaces: the web app, the CLI, the IDE extension, and the new Codex macOS app. Under the hood, they’re all powered by the same Codex harness: the agent loop and logic that underlies all Codex experiences. The critical link between them? The Codex App Server, a client-friendly, bidirectional JSON-RPC[1] API.
In this post, we’ll introduce the Codex App Server and share what we’ve learned so far about bringing Codex’s capabilities into your product to help your users supercharge their workflows. We’ll cover the App Server’s architecture and protocol, how it integrates with different Codex surfaces, and tips on leveraging Codex, whether you want to turn it into a code reviewer, an SRE agent, or a coding assistant.
Before diving into architecture, it’s helpful to know the App Server’s backstory. Initially, the App Server was a practical way to reuse the Codex harness across products; it gradually evolved into our standard protocol.
Codex CLI started as a TUI (terminal user interface), meaning Codex is accessed through the terminal. When we built the VS Code extension (a more IDE-friendly way to interact with Codex agents), we needed a way to use the same harness so as to drive the same agent loop from an IDE UI without re-implementing it. That meant supporting rich interaction patterns beyond request/response, such as exploring the workspace, streaming progress as the agent reasons, and emitting diffs. We first experimented with exposing Codex as an MCP server, but maintaining MCP semantics in a way that made sense for VS Code proved difficult. Instead, we introduced a JSON-RPC protocol that mirrored the TUI loop, which became the unofficial first version of the App Server. At the time, we didn’t expect other clients to depend on the App Server, so it wasn’t designed as a stable API.
As Codex adoption grew over the next few months, internal teams and external partners wanted the ability to embed the same harness in their own products in order to accelerate their users’ software development workflows. For example, JetBrains and Xcode wanted an IDE-grade agent experience, while the Codex desktop app needed to orchestrate many Codex agents in parallel. Those demands pushed us to design a platform surface that both our products and partner integrations could safely depend on over time. It needed to be easy to integrate and backward compatible, meaning we could evolve the protocol without breaking existing clients.
Next, we’ll walk through how we designed the architecture and protocol so different clients can use the same harness.
First, let’s zoom in on what’s inside the Codex harness and how the Codex App Server exposes it to clients. In our last Codex blog, we broke down the core agent loop that orchestrates the interaction between the user, the model, and the tools. This is the core logic of the Codex harness, but there’s more to the full agent experience:
1. Thread lifecycle and persistence. A thread is a Codex conversation between a user and an agent. Codex creates, resumes, forks, and archives threads, and persists the event history so clients can reconnect and render a consistent timeline.
2. Config and auth. Codex loads configuration, manages defaults, and runs authentication flows like “Sign in with ChatGPT,” including credential state.
3. Tool execution and extensions. Codex executes shell/file tools in a sandbox and wires up integrations like MCP servers and skills so they can participate in the agent loop under a consistent policy model.
All the agent logic we mentioned here, including the core agent loop, lives in a part of the Codex CLI codebase called “Codex core.” Codex core is both a library where all the agent code lives and a runtime that can be spun up to run the agent loop and manage the persistence of one Codex thread (conversation).
To be useful, the Codex harness needs to be accessible to clients. That’s where the App Server comes in.
The App Server is both the JSON-RPC protocol between the client and the server and a long-lived process that hosts the Codex core threads. As we can see from the diagram above, an App Server process has four main components: the stdio reader, the Codex message processor, the thread manager, and core threads. The thread manager spins up one core session for each thread, and the Codex message processor then communicates with each core session directly to submit client requests and receive updates.
One client request can result in many event updates, and these detailed events are what allow us to build a rich UI on top of the App Server. Furthermore, the stdio reader and the Codex message processor serve as the translation layer between the client and Codex core threads. They translate client JSON-RPC requests into Codex core operations, listen to Codex core’s internal event stream, and then transform those low-level events into a small set of stable, UI-ready JSON-RPC notifications.
The JSON-RPC protocol between the client and the App Server is fully bidirectional. A typical thread has a client request and many server notifications. In addition, the server can initiate requests when the agent needs input, like an approval, and then pause the turn until the client responds.
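To make the bidirectional shape concrete, here is a minimal sketch of how a client might tell the three message kinds apart on the wire. It assumes the header-less, JSONL-framed variant described in the footnotes; the method names `turn/start` and `approval/request` and the empty params are placeholders for illustration, not confirmed protocol names.

```python
import json

def encode(message: dict) -> str:
    """Frame one message as a single JSONL line (no "jsonrpc" header)."""
    return json.dumps(message, separators=(",", ":")) + "\n"

def classify(message: dict) -> str:
    """Return 'request', 'notification', or 'response' from the message shape."""
    has_method = "method" in message
    has_id = "id" in message
    if has_method and has_id:
        return "request"        # expects a reply; either side may send one
    if has_method:
        return "notification"   # one-way progress update
    if has_id and ("result" in message or "error" in message):
        return "response"       # answers an earlier request with the same id
    raise ValueError(f"unrecognized message shape: {message!r}")

# One client request, its response, a notification, and a server-initiated request.
wire = "".join(encode(m) for m in [
    {"id": 1, "method": "turn/start", "params": {}},        # hypothetical name
    {"id": 1, "result": {}},
    {"method": "item/started", "params": {}},
    {"id": 7, "method": "approval/request", "params": {}},  # hypothetical name
])
kinds = [classify(json.loads(line)) for line in wire.splitlines()]
print(kinds)
```

Because a server-initiated request has the same shape as a client request, the only thing distinguishing them is the direction of travel, which is why a client needs a handler for incoming requests as well as for notifications.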
Next, we’ll break down the conversation primitives, the building blocks of the App Server protocol. Designing an API for an agent loop is tricky because the user/agent interaction is not a simple request/response. One user request can unfold into a structured sequence of actions that the client needs to represent faithfully: the user’s input, the agent’s incremental progress, and the artifacts produced along the way (e.g., diffs). To make that interaction stream easy to integrate and resilient across UIs, we landed on three core primitives with clear boundaries and lifecycles:
1. Item: An item is the atomic unit of input/output in Codex. Items are typed (e.g., user message, agent message, tool execution, approval request, diff) and each has an explicit lifecycle:
- item/started when the item begins
- optional item/*/delta events as content streams in (for streaming item types)
- item/completed when the item finalizes with its terminal payload
This lifecycle lets clients start rendering immediately on started, stream incremental updates on delta, and finalize on completed.
2. Turn: A turn is one unit of agent work initiated by user input. It begins when the client submits an input (for example, “run tests and summarize failures”) and ends when the agent finishes producing outputs for that input. A turn contains a sequence of items that represent the intermediate steps and outputs produced along the way.
3. Thread: A thread is the durable container for an ongoing Codex session between a user and an agent. It contains multiple turns. Threads can be created, resumed, forked, and archived. Thread history is persisted so clients can reconnect and render a consistent timeline.
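The three primitives nest naturally, and the item lifecycle can be folded into them with a small reducer. The sketch below is an illustration only: the class layout and the payload fields `itemId`, `type`, `delta`, and `text` are assumptions, since the post names the events but not their payloads.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Item:
    id: str
    type: str
    text: str = ""
    completed: bool = False

@dataclass
class Turn:
    id: str
    items: Dict[str, Item] = field(default_factory=dict)

@dataclass
class Thread:
    id: str
    turns: List[Turn] = field(default_factory=list)

def apply_event(turn: Turn, method: str, params: dict) -> None:
    """Fold one item lifecycle notification into the turn's state."""
    if method == "item/started":
        turn.items[params["itemId"]] = Item(params["itemId"], params["type"])
    elif method.startswith("item/") and method.endswith("/delta"):
        turn.items[params["itemId"]].text += params["delta"]
    elif method == "item/completed":
        item = turn.items[params["itemId"]]
        # The terminal payload, when present, replaces the streamed text.
        item.text = params.get("text", item.text)
        item.completed = True

thread = Thread("thr_1", [Turn("turn_1")])
events = [
    ("item/started", {"itemId": "a", "type": "agentMessage"}),
    ("item/agentMessage/delta", {"itemId": "a", "delta": "Tests "}),
    ("item/agentMessage/delta", {"itemId": "a", "delta": "passed."}),
    ("item/completed", {"itemId": "a", "text": "Tests passed."}),
]
for method, params in events:
    apply_event(thread.turns[-1], method, params)
print(thread.turns[-1].items["a"])
```

A UI can render after every call to `apply_event`, which matches the start, stream, finalize rhythm described above.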
Now, we’ll look at a simplified conversation between a client and an agent, where the conversation is represented by primitives:
At the beginning of the conversation, the client and the server need to establish the initialize handshake. The client must send a single initialize request before any other method, and the server acknowledges with a response. This gives the server a chance to advertise capabilities and lets both sides agree on protocol versioning, feature flags, and defaults before the real work begins. Here’s an example payload from OpenAI’s VS Code extension:
This is what the server returns:
When a client makes a new request, it will first create a thread and then a turn. The server will send back notifications for progress (thread/started and turn/started). It will also send back inputs it registers as items, like the user message here.
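The ordering rule (a single initialize before anything else, then a thread, then a turn) can be enforced on the client side with a small guard. This is a sketch under assumptions: only `initialize` is named in the post as a request, so `thread/start`, `turn/start`, and every params field here are hypothetical. For simplicity the guard flips as soon as `initialize` is built, whereas a real client would likely wait for the server's acknowledgment.

```python
import itertools

class RequestBuilder:
    """Builds outgoing requests and rejects anything sent before initialize."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._initialized = False

    def request(self, method, params=None):
        if method == "initialize":
            if self._initialized:
                raise RuntimeError("initialize must be sent only once")
            self._initialized = True
        elif not self._initialized:
            raise RuntimeError(f"{method!r} sent before initialize")
        return {"id": next(self._ids), "method": method, "params": params or {}}

builder = RequestBuilder()
outgoing = [
    builder.request("initialize", {"clientInfo": {"name": "example-client"}}),
    builder.request("thread/start"),
    builder.request("turn/start", {"input": "run tests and summarize failures"}),
]
for message in outgoing:
    print(message["id"], message["method"])
```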
Tool calls are also sent back to the client as items. Additionally, the server may ask for client approval before it can run an action by sending a server request. The approval will pause the turn until the client replies with either “allow” or “deny.” This is what the approval flow looks like in the VS Code extension:

In the end, the server sends an agent message and then ends the turn with turn/completed. The agent message delta events stream pieces of the message back until the message is finalized with item/completed.
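Putting the approval and streaming steps together, a client's per-turn loop might look like the sketch below. It runs against an in-memory list rather than a live server, and the `approval/request` method name, the `decision` result field, and the params are assumptions; only the "allow"/"deny" choice and the notification names come from the description above.

```python
def run_turn(incoming, approve):
    """Consume server messages for one turn; return (final_text, replies)."""
    replies, chunks, final_text = [], [], None
    for msg in incoming:
        method = msg.get("method", "")
        if method and "id" in msg:
            # Server-initiated request: the turn stays paused until we reply.
            decision = "allow" if approve(msg["params"]) else "deny"
            replies.append({"id": msg["id"], "result": {"decision": decision}})
        elif method.endswith("/delta"):
            chunks.append(msg["params"]["delta"])
        elif method == "item/completed" and msg["params"].get("type") == "agentMessage":
            # Fall back to the streamed chunks if no terminal text is supplied.
            final_text = msg["params"].get("text", "".join(chunks))
        elif method == "turn/completed":
            break
    return final_text, replies

incoming = [
    {"method": "turn/started", "params": {}},
    {"id": 40, "method": "approval/request", "params": {"command": "pytest -q"}},
    {"method": "item/started", "params": {"type": "agentMessage"}},
    {"method": "item/agentMessage/delta", "params": {"delta": "2 tests "}},
    {"method": "item/agentMessage/delta", "params": {"delta": "failed."}},
    {"method": "item/completed", "params": {"type": "agentMessage"}},
    {"method": "turn/completed", "params": {}},
]
final_text, replies = run_turn(incoming, lambda p: not p["command"].startswith("rm"))
print(final_text)
print(replies)
```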
The messages in the diagram are simplified to make them easier to read. If you want to see the JSON for a full turn, you can run the test client from the Codex CLI repo:
Now, let’s look at how different client surfaces embed Codex through the App Server. We’ll cover three patterns: local apps and IDEs, the Codex web runtime, and the TUI.
In all three, the transport is JSON-RPC over stdio (JSONL). JSON-RPC makes it easy to build client bindings in the language of your choice. Codex surfaces and partner integrations have implemented App Server clients in languages including Go, Python, TypeScript, Swift, and Kotlin. For TypeScript, you can generate definitions directly from the Rust protocol by running:
For other languages, you can generate a JSON Schema bundle and feed it into your preferred code generator by running:

Local clients typically bundle or fetch a platform-specific App Server binary, launch it as a long-running child process, and keep a bidirectional stdio channel open for JSON-RPC. In our VS Code extension and Desktop App, for example, the shipped artifact includes the platform-specific Codex binary and is pinned to a tested version so the client always runs exactly the bits we validated.
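The child-process pattern can be sketched with nothing but the standard library. Since this example should not depend on any particular binary being installed, it launches a tiny stand-in script that answers every request; in a real integration the argument list would instead point at the bundled App Server binary. The stand-in's `echo` result is invented for the demo.

```python
import json
import subprocess
import sys

# Stand-in for the server: replies to each JSONL request with the same id.
STAND_IN_SERVER = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    print(json.dumps({"id": req["id"], "result": {"echo": req["method"]}}), flush=True)
"""

def spawn(argv):
    """Start a long-lived child with line-buffered text pipes on stdin/stdout."""
    return subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1
    )

def call(proc, message):
    """Write one JSONL request and block until one JSONL line comes back."""
    proc.stdin.write(json.dumps(message) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())

proc = spawn([sys.executable, "-c", STAND_IN_SERVER])
try:
    reply = call(proc, {"id": 1, "method": "initialize", "params": {}})
finally:
    proc.stdin.close()      # EOF lets the child exit its read loop
    proc.wait(timeout=10)
    proc.stdout.close()
print(reply)
```

The blocking `call` helper is only adequate for this demo; a real client would read stdout on a separate task so that notifications and server-initiated requests can arrive at any time.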
Not every integration can ship client updates frequently. Some partners, like Xcode, decouple release cycles by keeping the client stable and letting it point to a newer App Server binary when needed. That way they can adopt server-side improvements (for example, better auto-compaction in Codex core or newly supported config keys) and roll out bug fixes without waiting for a client release. The App Server’s JSON-RPC surface is designed to be backward compatible, so older clients can talk to newer servers safely.
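One client-side habit that pairs well with a backward-compatible server is to ignore notifications the client does not recognize instead of failing on them. The sketch below shows that idea; it is a general "tolerant reader" pattern offered as an assumption about how a client could behave, not a documented requirement, and `item/somethingNewer` is an invented method name.

```python
def dispatch(handlers, message):
    """Route a notification to its handler; skip methods this client predates."""
    handler = handlers.get(message.get("method"))
    if handler is None:
        return False  # unknown method from a newer server: ignore it
    handler(message.get("params", {}))
    return True

seen = []
handlers = {"item/started": lambda params: seen.append(params)}

handled = [
    dispatch(handlers, {"method": "item/started", "params": {"itemId": "a"}}),
    dispatch(handlers, {"method": "item/somethingNewer", "params": {}}),
]
print(handled, seen)
```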

Codex Web uses the Codex harness but runs it in a container environment. A worker provisions a container with the workspace already checked out, launches the App Server binary inside it, and maintains a long-lived JSON-RPC over stdio[2] channel. The web app (running in the user’s browser tab) talks to the Codex backend over HTTP and SSE, which streams the task events produced by the worker. This keeps the browser-side UI lightweight while still giving us a consistent runtime across desktop and web.
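As a rough illustration of the relay step, a backend could wrap each notification in a Server-Sent Events frame before sending it to the browser. The SSE framing (`event:` and `data:` lines ended by a blank line) is standard; using the method as the event name and `turnId` as a field are assumptions made for this sketch.

```python
import json

def to_sse(notification: dict) -> str:
    """Render one notification as a single Server-Sent Events frame."""
    # Compact JSON never contains raw newlines, so one data line is enough.
    data = json.dumps(notification.get("params", {}), separators=(",", ":"))
    return f"event: {notification['method']}\ndata: {data}\n\n"

frame = to_sse({"method": "turn/completed", "params": {"turnId": "turn_1"}})
print(frame, end="")
```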
Because web sessions are ephemeral (tabs close, networks drop), the web app can’t be the source of truth for long-running tasks. Keeping state and progress on the server means work continues even if the tab disappears. The streaming protocol and saved thread sessions make it easy for a new session to reconnect, pick up where it left off, and catch up without rebuilding state in the client.
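The reconnect-and-catch-up behavior can be pictured as replaying a server-held event log from a cursor. The sketch below is a toy model: `ThreadLog`, `Viewer`, and the integer cursor are hypothetical names and mechanics chosen for illustration, not the actual persistence design.

```python
class ThreadLog:
    """Server-side stand-in: append-only event history for one thread."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def since(self, cursor):
        """Return events after the cursor plus the new cursor position."""
        return self._events[cursor:], len(self._events)

class Viewer:
    """Client-side stand-in: holds only a cursor and the rendered timeline."""

    def __init__(self):
        self.cursor = 0
        self.timeline = []

    def catch_up(self, log):
        events, self.cursor = log.since(self.cursor)
        self.timeline.extend(events)

log = ThreadLog()
log.append("turn/started")
log.append("item/started")

first_tab = Viewer()
first_tab.catch_up(log)          # sees the first two events, then the tab closes

log.append("item/completed")     # work continues on the server
log.append("turn/completed")

new_tab = Viewer()
new_tab.catch_up(log)            # a fresh session replays the whole history
first_tab.catch_up(log)          # a resumed session fetches only what it missed
print(new_tab.timeline)
```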

Historically, the TUI was a “native” client that ran in the same process as the agent loop and talked directly to Rust core types rather than the app-server protocol. That made early iteration fast, but it also made the TUI a special-case surface.
Now that the App Server exists, we plan to refactor the TUI to use it so that it behaves like any other client: launch an App Server child process, speak JSON-RPC over stdio, and render the same streaming events and approvals. This unlocks workflows where the TUI can connect to a Codex server running on a remote machine, keeping the agent close to compute and continuing work even if the laptop sleeps or disconnects, while still delivering live updates and controls locally.
Codex App Server will be the first-class integration method we maintain going forward, but there are also other methods with more limited functionality. By default, we recommend that clients use Codex App Server to integrate with Codex, but it’s worth looking at the different integration methods and understanding their pros and cons. Below are the most common ways to drive Codex and when each one might be a good fit.
Run codex mcp-server and connect from any MCP client that supports stdio servers (e.g., OpenAI Agents SDK). This is a good fit if you already have an MCP-based workflow and want to invoke Codex as a callable tool. The downside is that you only get what MCP exposes, so Codex-specific interactions that rely on richer session semantics (e.g., diff updates) may not map cleanly onto MCP endpoints.
Some ecosystems offer a portable interface that can target multiple model providers and runtimes. This can be a good fit if you want one abstraction that coordinates multiple agents. The tradeoff is that these protocols often converge on the common subset of capabilities, which can make richer interactions harder to represent, especially when provider-specific tool and session semantics matter. This space is evolving quickly, and we expect more common standards to emerge as we figure out the best primitives for representing real-world agent workflows (skills is a good example of this).
Choose the App Server when you want the full Codex harness exposed as a stable, UI-friendly event stream. You get both the full functionality of the agent loop and supporting features like Sign in with ChatGPT, model discovery, and configuration management. The main cost is integration work, since you need to build the client-side JSON-RPC binding in your language. In practice, though, Codex can do much of the heavy lifting if you give it the JSON schema and documentation. Many teams we worked with were able to get to a working integration quickly by using Codex.
A lightweight, scriptable CLI mode for one-off tasks and CI runs. It’s a good fit for automation and pipelines where you want a single command to run to completion without interaction, stream structured output for logs, and exit with a clear success or failure signal.
A TypeScript library for programmatically controlling local Codex agents from within your own application. It’s best when you want a native library interface for server-side tools and workflows without building a separate JSON-RPC client. Because it shipped before the App Server, it currently supports fewer languages and a smaller surface. If there is developer interest, we may add additional SDKs that wrap the App Server protocol so teams can cover more of the harness surface without writing JSON-RPC bindings.
In this post, I shared how we approach designing a new standard for interacting with agents and how we turned the Codex harness into a stable, client-friendly protocol. We covered how the App Server exposes Codex core, lets clients drive the full agent loop, and powers a wide range of surfaces including the TUI, local IDE integrations, and the web runtime.
If this sparked ideas for integrating Codex into your own workflows, it’s worth giving the App Server a try. All the source code lives in the open-source Codex CLI repo. Feel free to share your feedback and feature requests. We’re excited to hear from you and to keep making agents more accessible to everyone.
Acknowledgments
Special thanks to Michael Bolin, Owen Lin, Eric Traut, and Rasmus Rygaard, who contributed to this post, and to the entire Codex team that worked on the App Server.
Footnotes
- 1 We use a “JSON-RPC lite” variant: it keeps the request/response/notification shape, but omits the "jsonrpc": "2.0" header and is framed as JSONL over stdio rather than strict JSON-RPC 2.0.
- 2 “stdio” refers to the app-server’s stdin/stdout inside the container. In hosted setups, those streams are often tunneled over a persistent network connection (e.g., a WebSocket) to the container runtime, so the channel behaves like stdio even when it isn’t a literal local pipe.


