Unlocking the Codex harness: how we built the App Server
By Celia Chen, Member of the Technical Staff
OpenAI’s coding agent Codex exists across many different surfaces: the web app, the CLI, the IDE extension, and the new Codex macOS app. Under the hood, they’re all powered by the same Codex harness: the agent loop and logic that underlies all Codex experiences. The critical link between them? The Codex App Server, a client-friendly, bidirectional JSON-RPC[1] API.
In this post, we’ll introduce the Codex App Server and share what we’ve learned so far about the best ways to bring Codex’s capabilities into your product and help your users supercharge their workflows. We’ll cover the App Server’s architecture and protocol, how it integrates with different Codex surfaces, and tips on using Codex, whether you want to turn it into a code reviewer, an SRE agent, or a coding assistant.
Before diving into architecture, it’s helpful to know the App Server’s backstory. Initially, the App Server was simply a practical way to reuse the Codex harness across products. Over time, it evolved into our standard protocol.
Codex CLI started as a TUI (terminal user interface), meaning Codex is accessed through the terminal. When we built the VS Code extension (a more IDE-friendly way to interact with Codex agents), we needed a way to reuse the same harness and drive the same agent loop from an IDE UI without re-implementing it. That meant supporting rich interaction patterns beyond request/response, such as exploring the workspace, streaming progress as the agent reasons, and emitting diffs. We first experimented with exposing Codex as an MCP server, but maintaining MCP semantics in a way that made sense for VS Code proved difficult. Instead, we introduced a JSON-RPC protocol that mirrored the TUI loop, which became the unofficial first version of the App Server. At the time, we didn’t expect other clients to depend on the App Server, so it wasn’t designed as a stable API.
As Codex adoption grew over the next few months, internal teams and external partners wanted the ability to embed the same harness in their own products in order to accelerate their users’ software development workflows. For example, JetBrains and Xcode wanted an IDE-grade agent experience, while the Codex desktop app needed to orchestrate many Codex agents in parallel. Those demands pushed us to design a platform surface that both our products and partner integrations could safely depend on over time. It needed to be easy to integrate and backward compatible, meaning we could evolve the protocol without breaking existing clients.
Next, we’ll walk through how we designed the architecture and protocol so different clients can use the same harness.
First, let’s zoom in on what’s inside the Codex harness and how the Codex App Server exposes it to clients. In our last Codex blog, we broke down the core agent loop that orchestrates the interaction between the user, the model, and the tools. This is the core logic of the Codex harness, but there’s more to the full agent experience:
1. Thread lifecycle and persistence. A thread is a Codex conversation between a user and an agent. Codex creates, resumes, forks, and archives threads, and persists the event history so clients can reconnect and render a consistent timeline.
2. Config and auth. Codex loads configuration, manages defaults, and runs authentication flows like “Sign in with ChatGPT,” including credential state.
3. Tool execution and extensions. Codex executes shell/file tools in a sandbox and wires up integrations like MCP servers and skills so they can participate in the agent loop under a consistent policy model.
All the agent logic we mentioned here, including the core agent loop, lives in a part of the Codex CLI codebase called “Codex core.” Codex core is both a library where all the agent code lives and a runtime that can be spun up to run the agent loop and manage the persistence of one Codex thread (conversation).
To be useful, the Codex harness needs to be accessible to clients. That’s where the App Server comes in.
The App Server is both the JSON-RPC protocol between the client and the server and a long-lived process that hosts the Codex core threads. As we can see from the diagram above, an App Server process has four main components: the stdio reader, the Codex message processor, the thread manager, and core threads. The thread manager spins up one core session for each thread, and the Codex message processor then communicates with each core session directly to submit client requests and receive updates.
One client request can result in many event updates, and these detailed events are what allow us to build a rich UI on top of the App Server. Furthermore, the stdio reader and the Codex message processor serve as the translation layer between the client and Codex core threads. They translate client JSON-RPC requests into Codex core operations, listen to Codex core’s internal event stream, and then transform those low-level events into a small set of stable, UI-ready JSON-RPC notifications.
The JSON-RPC protocol between the client and the App Server is fully bidirectional. A typical thread has a client request and many server notifications. In addition, the server can initiate requests when the agent needs input, like an approval, and then pause the turn until the client responds.
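As a minimal sketch of what bidirectional framing can look like on the client side, the Python below serializes messages as one JSON object per line (the JSONL framing described in footnote 1, without a "jsonrpc" header) and classifies incoming messages by shape. The helper names `encode_message`, `decode_message`, and `classify` are hypothetical and are not part of the App Server; the classification rule follows general JSON-RPC conventions rather than anything specific to Codex.

```python
import json
from typing import Any


def encode_message(message: dict[str, Any]) -> str:
    """Serialize one message as a single JSONL line (newlines in strings are escaped)."""
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_message(line: str) -> dict[str, Any]:
    """Parse one JSONL line back into a message dict."""
    return json.loads(line)


def classify(message: dict[str, Any]) -> str:
    """Classify a message by shape, following JSON-RPC conventions."""
    has_method = "method" in message
    has_id = "id" in message
    if has_method and has_id:
        return "request"       # expects a response carrying the same id
    if has_method:
        return "notification"  # one-way event, no reply expected
    if has_id:
        return "response"      # answers an earlier request
    raise ValueError("not a recognizable JSON-RPC message")
```

Because either side can send a request, a client built this way needs to run the same classification on everything it reads, not only on replies to its own calls.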
Next, we’ll break down the conversation primitives, the building blocks of the App Server protocol. Designing an API for an agent loop is tricky because the user/agent interaction is not a simple request/response. One user request can unfold into a structured sequence of actions that the client needs to represent faithfully: the user’s input, the agent’s incremental progress, and the artifacts produced along the way (e.g., diffs). To make that interaction stream easy to integrate and resilient across UIs, we landed on three core primitives with clear boundaries and lifecycles:
1. Item: An item is the atomic unit of input/output in Codex. Items are typed (e.g., user message, agent message, tool execution, approval request, diff) and each has an explicit lifecycle:
- item/started when the item begins
- optional item/*/delta events as content streams in (for streaming item types)
- item/completed when the item finalizes with its terminal payload
This lifecycle lets clients start rendering immediately on started, stream incremental updates on delta, and finalize on completed.
2. Turn: A turn is one unit of agent work initiated by user input. It begins when the client submits an input (for example, “run tests and summarize failures”) and ends when the agent finishes producing outputs for that input. A turn contains a sequence of items that represent the intermediate steps and outputs produced along the way.
3. Thread: A thread is the durable container for an ongoing Codex session between a user and an agent. It contains multiple turns. Threads can be created, resumed, forked, and archived. Thread history is persisted so clients can reconnect and render a consistent timeline.
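The three primitives nest naturally, which a small client-side model can make concrete. The sketch below is one possible shape, not the App Server's actual types: the class names, the `itemId`, `type`, `delta`, and `text` fields, and the `item/agentMessage/delta` method used in the usage note are all assumptions made for illustration. Only the item/started, item/*/delta, and item/completed lifecycle comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    item_id: str
    item_type: str           # e.g. "agent_message" (illustrative value)
    text: str = ""           # content accumulated from delta events
    completed: bool = False


@dataclass
class Turn:
    turn_id: str
    # dicts preserve insertion order, so items stay in arrival order
    items: dict[str, Item] = field(default_factory=dict)


@dataclass
class Thread:
    thread_id: str
    turns: list[Turn] = field(default_factory=list)


def apply_item_event(turn: Turn, method: str, params: dict) -> None:
    """Fold one item notification into the turn's state."""
    item_id = params["itemId"]  # hypothetical field name
    if method == "item/started":
        turn.items[item_id] = Item(item_id, params["type"])
    elif method.startswith("item/") and method.endswith("/delta"):
        turn.items[item_id].text += params["delta"]
    elif method == "item/completed":
        item = turn.items[item_id]
        # Prefer the terminal payload; fall back to the streamed text.
        item.text = params.get("text", item.text)
        item.completed = True
```

A UI could call `apply_item_event` for every notification and re-render the affected item: a started event creates a placeholder, each delta extends it, and completed replaces the streamed content with the final payload.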
Now, we’ll look at a simplified conversation between a client and an agent, where the conversation is represented by primitives:
At the beginning of the conversation, the client and the server need to establish the initialize handshake. The client must send a single initialize request before any other method, and the server acknowledges with a response. This gives the server a chance to advertise capabilities and lets both sides agree on protocol versioning, feature flags, and defaults before the real work begins. Here’s an example payload from OpenAI’s VS Code extension:
This is what the server returns:
When a client makes a new request, it will first create a thread and then a turn. The server will send back notifications for progress (thread/started and turn/started). It will also send back inputs it registers as items, like the user message here.
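The initialize-first rule is easy to enforce in a client wrapper. The sketch below is a hypothetical `ClientSession` class that builds outgoing requests and refuses any method until the initialize response has arrived. Waiting for the response is a conservative choice made here, not something the protocol description requires, and the `thread/start` method name and `clientInfo` field used in the usage note are assumptions for illustration.

```python
import itertools
from typing import Any, Optional


class ClientSession:
    """Builds outgoing requests and enforces initialize-first ordering."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._init_id: Optional[int] = None  # id of the initialize request
        self.initialized = False

    def request(self, method: str, params: dict[str, Any]) -> dict[str, Any]:
        if method == "initialize":
            if self._init_id is not None:
                raise RuntimeError("initialize may only be sent once")
        elif not self.initialized:
            raise RuntimeError(f"{method} sent before initialize completed")
        message = {"id": next(self._ids), "method": method, "params": params}
        if method == "initialize":
            self._init_id = message["id"]
        return message

    def on_response(self, response: dict[str, Any]) -> None:
        """Mark the handshake complete once the server answers initialize."""
        if (
            self._init_id is not None
            and response.get("id") == self._init_id
            and "result" in response
        ):
            self.initialized = True
```

For example, `session.request("thread/start", {})` raises until `session.request("initialize", {"clientInfo": {"name": "example_client"}})` has been sent and its response passed to `on_response`.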
Tool calls are also sent back to the client as items. Additionally, the server may ask for client approval before it can run an action by sending a server request. The approval will pause the turn until the client replies with either “allow” or “deny.” This is what the approval flow looks like in the VS Code extension:

In the end, the server sends an agent message and then ends the turn with turn/completed. The agent message delta events stream pieces of the message back until the message is finalized with item/completed.
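The approval step is the one place in this flow where the server sends a request and the client must reply. A rough sketch of that reply path follows. The method name in `APPROVAL_METHOD`, the `decision` result field, and the `command` parameter used in the usage note are assumptions made for illustration; only the "allow"/"deny" outcomes and the idea of pausing until the client responds come from the text above. The error code -32601 is the standard JSON-RPC "Method not found" code.

```python
from typing import Any, Callable, Optional

APPROVAL_METHOD = "approval/request"  # hypothetical method name


def handle_server_message(
    message: dict[str, Any],
    approve: Callable[[dict[str, Any]], bool],
) -> Optional[dict[str, Any]]:
    """Return the reply to send for a server-initiated request, else None."""
    if "method" not in message or "id" not in message:
        return None  # notifications and responses need no reply
    if message["method"] != APPROVAL_METHOD:
        # Unknown server request: answer with a standard JSON-RPC error.
        return {
            "id": message["id"],
            "error": {"code": -32601, "message": "Method not found"},
        }
    decision = "allow" if approve(message.get("params", {})) else "deny"
    # Echo the server's id so it can match the reply and resume the turn.
    return {"id": message["id"], "result": {"decision": decision}}
```

The `approve` callback is where a UI would show a prompt; a headless client could pass a policy function such as `lambda params: params.get("command", "").startswith("ls")`.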
The messages in this diagram have been simplified to make them easier to read. If you want to see the JSON for a full turn, you can run the test client from the Codex CLI repo:
Next, let’s look at how different client surfaces embed Codex through the App Server. We’ll cover three patterns: local apps and IDEs, the Codex web runtime, and the TUI.
Across all three, the transport is JSON-RPC over stdio (JSONL). JSON-RPC makes it straightforward to build client bindings in the language of your choice. Codex surfaces and partner integrations have implemented App Server clients in languages including Go, Python, TypeScript, Swift, and Kotlin. For TypeScript, you can generate definitions directly from the Rust protocol by running:
For other languages, you can generate a JSON Schema bundle and feed it into your preferred code generator by running:

Local clients typically bundle or fetch a platform-specific App Server binary, launch it as a long-running child process, and keep a bidirectional stdio channel open for JSON-RPC. For example, in the VS Code extension and the desktop app, the shipped artifact includes a platform-specific Codex binary pinned to a tested version, so the client always runs the exact bits we validated.
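The child-process pattern can be sketched with Python's standard `subprocess` module. To keep the example runnable without a Codex install, it launches a tiny stand-in child (`ECHO_SERVER`, a hypothetical script that answers every request with an empty result) instead of the real binary; a real client would pass the command for its bundled App Server binary to `spawn`. The `spawn`, `call`, and `roundtrip` helpers are illustrative names, not part of any Codex library.

```python
import json
import subprocess
import sys

# Stand-in child: answers each JSONL request with an empty result.
# A real client would launch its bundled App Server binary instead.
ECHO_SERVER = (
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    req = json.loads(line)\n"
    "    print(json.dumps({'id': req['id'], 'result': {}}), flush=True)\n"
)


def spawn(command: list[str]) -> subprocess.Popen:
    """Start a long-lived child with line-buffered text pipes."""
    return subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,
    )


def call(proc: subprocess.Popen, message: dict) -> dict:
    """Write one JSONL message and block until one line comes back."""
    proc.stdin.write(json.dumps(message) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())


def roundtrip() -> dict:
    """Spawn the stand-in child, send one request, and shut down cleanly."""
    proc = spawn([sys.executable, "-c", ECHO_SERVER])
    try:
        return call(proc, {"id": 1, "method": "initialize", "params": {}})
    finally:
        proc.stdin.close()   # EOF on stdin ends the child's read loop
        proc.stdout.close()
        proc.wait(timeout=5)
```

The blocking `call` helper is only adequate for this stand-in. Against a server that interleaves notifications and its own requests with responses, a client would instead read stdout on a dedicated thread or task and route each message by its id and method.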
Not every integration can ship client updates frequently. Some partners, like Xcode, decouple release cycles by keeping the client stable and letting it point to a newer App Server binary when needed. That way, they can adopt server-side improvements (for example, better auto-compaction in Codex core or newly supported config keys) and roll out bug fixes without waiting for a client release. The App Server’s JSON-RPC surface is designed to be backward compatible, so older clients can talk to newer servers safely.

Codex Web uses the Codex harness, but runs it in a container environment. A worker provisions a container with the checked-out workspace, launches the App Server binary inside it, and maintains a long-lived JSON-RPC over stdio[2] channel. The web app (running in the user’s browser tab) talks to the Codex backend over HTTP and SSE, which streams the task events produced by the worker. This keeps the browser-side UI lightweight while still providing a consistent runtime across desktop and web.
Because web sessions are ephemeral (tabs close, networks drop), the web app can’t be the source of truth for long-running tasks. Keeping state and progress on the server means work continues even if the tab disappears. The streaming protocol and saved thread sessions make it easy for a new session to reconnect, pick up where it left off, and catch up without rebuilding state in the client.
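One way to picture "reconnect and catch up" is an append-only event log with a per-client cursor. The sketch below is purely illustrative and assumes nothing about how Codex actually persists threads: `ThreadEventLog`, `append`, and `replay_from` are hypothetical names, and the cursor is simply the count of events the client has already seen.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadEventLog:
    """Append-only, server-side history for one thread (illustrative only)."""

    events: list[dict] = field(default_factory=list)

    def append(self, event: dict) -> int:
        """Record an event and return a cursor pointing just past it."""
        self.events.append(event)
        return len(self.events)

    def replay_from(self, cursor: int) -> tuple[list[dict], int]:
        """Return the events a client has not seen, plus its new cursor."""
        missed = self.events[cursor:]
        return missed, len(self.events)
```

A browser tab that last saw cursor 2 and reconnects after two more events would call `replay_from(2)`, render the two missed events in order, and continue streaming from cursor 4, without reconstructing any state of its own.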

Historically, the TUI was a “native” client that ran in the same process as the agent loop and talked directly to the Rust core types rather than the app-server protocol. This made early iteration fast, but it also made the TUI a special-case surface.
Now that the App Server exists, we plan to refactor the TUI to use it, so that it behaves like any other client: launching an App Server child process, speaking JSON-RPC over stdio, and rendering the same streaming events and approvals. This unlocks workflows where the TUI connects to a Codex server running on a remote machine, keeping the agent close to compute and continuing work even if the laptop sleeps or disconnects, while still delivering live updates and controls locally.
The Codex App Server will be the first-class integration method that we continue to maintain, but there are other methods with more limited functionality. By default, we recommend that clients use the Codex App Server to integrate with Codex, but it’s worth looking at the different integration methods and understanding their pros and cons. Below are the most common ways to drive Codex and when each one is a good fit.
Run codex mcp-server and connect from any MCP client that supports stdio servers (for example, the OpenAI Agents SDK). This is a good fit if you already have an MCP-based workflow and want to use Codex as a callable tool. The downside is that you only get what MCP exposes, so Codex-specific interactions that rely on richer session semantics (for example, diff updates) may not map cleanly through MCP endpoints.
Some ecosystems offer portable interfaces that can target many model providers and runtimes. This can be a good option if you want a single abstraction that coordinates many agents. The tradeoff is that these protocols often converge on a common subset of capabilities, which can make richer interactions harder to represent, especially when provider-specific tool and session semantics matter. This space is evolving quickly, and we expect more common standards to emerge as we discover the best primitives for representing real-world agent workflows (skills are a good example).
Choose the App Server if you want the full Codex harness exposed as a stable, UI-friendly event stream. You get the complete functionality of the agent loop plus supporting features such as Sign in with ChatGPT, model discovery, and configuration management. The main cost is integration effort, because you need to build client-side JSON-RPC bindings in your language. In practice, though, Codex can do much of the heavy lifting if you give it the JSON schema and documentation. Many teams we have worked with were able to get a working integration running quickly using Codex.
A lightweight, scriptable CLI mode for one-off tasks and CI runs. It’s a good fit for automation and pipelines where you want a single command to run to completion without interaction, stream structured output for logs, and exit with a clear success or failure signal.
A TypeScript library for programmatically controlling local Codex agents from your own application. It fits best when you want a native library interface for server-side tools and workflows without building a separate JSON-RPC client. Because it was released before the App Server, it currently supports fewer languages and has a smaller surface area. If there is developer interest, we may add more SDKs that wrap the App Server protocol so teams can cover more of the harness surface without writing JSON-RPC bindings.
In this post, we shared how we approached designing a new standard for interacting with agents and how we turned the Codex harness into a stable, client-friendly protocol. We covered how the App Server exposes Codex core, lets clients drive the full agent loop, and powers a variety of surfaces, including the TUI, local IDE integrations, and the web runtime.
If this sparked ideas for integrating Codex into your own workflows, the App Server is worth a try. All the source code lives in the open-source Codex CLI repo. Feel free to share feedback and feature requests. We’d love to hear from you as we keep making agents more accessible for everyone.
Author
Acknowledgments
Special thanks to Michael Bolin, Owen Lin, Eric Traut, and Rasmus Rygaard, who contributed to this post, and to the entire Codex team working on the App Server.
Footnotes
1. We use a “JSON-RPC lite” variant: it keeps the request/response/notification shape, but omits the "jsonrpc": "2.0" header and is framed as JSONL over stdio rather than strict JSON-RPC 2.0.
2. “stdio” refers to the app-server’s stdin/stdout inside the container. In hosted setups, that stream is often tunneled over a persistent network connection (for example, something like a WebSocket) to the container runtime, so it behaves like stdio even though it isn’t a literal local pipe.


