
August 6, 2024

Introducing Structured Outputs in the API

We are introducing Structured Outputs in the API—model outputs now reliably adhere to developer-supplied JSON Schemas.


Last year at DevDay, we introduced JSON mode—a useful building block for developers looking to build reliable applications with our models. While JSON mode improves model reliability for generating valid JSON outputs, it does not guarantee that the model’s response will conform to a particular schema. Today we’re introducing Structured Outputs in the API, a new feature designed to ensure model-generated outputs will exactly match JSON Schemas provided by developers.

Generating structured data from unstructured inputs is one of the core use cases for AI in today’s applications. Developers use the OpenAI API to build powerful assistants that have the ability to fetch data and answer questions via function calling, extract structured data for data entry, and build multi-step agentic workflows that allow LLMs to take actions. Developers have long been working around the limitations of LLMs in this area via open source tooling, prompting, and retrying requests repeatedly to ensure that model outputs match the formats needed to interoperate with their systems. Structured Outputs solves this problem by constraining OpenAI models to match developer-supplied schemas and by training our models to better understand complicated schemas.

On our evals of complex JSON schema following, our new model gpt-4o-2024-08-06 with Structured Outputs scores a perfect 100%. In comparison, gpt-4-0613 scores less than 40%.

[Chart: Percentage of outputs matching complex JSON schemas (0–100%) for gpt-4-0613, gpt-4-turbo-2024-04-09, gpt-4o-2024-05-13, and gpt-4o-2024-08-06, comparing prompting alone, Structured Outputs (strict=false), and Structured Outputs (strict=true).]

With Structured Outputs, gpt-4o-2024-08-06 achieves 100% reliability in our evals, perfectly matching the output schemas.

How to use Structured Outputs

We’re introducing Structured Outputs in two forms in the API: 

1. Function calling: Structured Outputs via tools is available by setting strict: true within your function definition. This feature works with all models that support tools, including all models from gpt-4-0613 and gpt-3.5-turbo-0613 onward. When Structured Outputs is enabled, model outputs will match the supplied tool definition.

JSON
POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
    },
    {
      "role": "user",
      "content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "query",
        "description": "Execute a query.",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "table_name": {
              "type": "string",
              "enum": ["orders"]
            },
            "columns": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": [
                  "id",
                  "status",
                  "expected_delivery_date",
                  "delivered_at",
                  "shipped_at",
                  "ordered_at",
                  "canceled_at"
                ]
              }
            },
            "conditions": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "column": {
                    "type": "string"
                  },
                  "operator": {
                    "type": "string",
                    "enum": ["=", ">", "<", ">=", "<=", "!="]
                  },
                  "value": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "number"
                      },
                      {
                        "type": "object",
                        "properties": {
                          "column_name": {
                            "type": "string"
                          }
                        },
                        "required": ["column_name"],
                        "additionalProperties": false
                      }
                    ]
                  }
                },
                "required": ["column", "operator", "value"],
                "additionalProperties": false
              }
            },
            "order_by": {
              "type": "string",
              "enum": ["asc", "desc"]
            }
          },
          "required": ["table_name", "columns", "conditions", "order_by"],
          "additionalProperties": false
        }
      }
    }
  ]
}

2. A new option for the response_format parameter: developers can now supply a JSON Schema via json_schema, a new option for the response_format parameter. This is useful when the model is not calling a tool, but rather, responding to the user in a structured way. This feature works with our newest GPT-4o models: gpt-4o-2024-08-06, released today, and gpt-4o-mini-2024-07-18. When a response_format is supplied with strict: true, model outputs will match the supplied schema.

Request
POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}

Safe Structured Outputs

Safety is a top priority for OpenAI—the new Structured Outputs functionality will abide by our existing safety policies and will still allow the model to refuse an unsafe request. To make development simpler, there is a new refusal string value on API responses, which allows developers to programmatically detect if the model has generated a refusal instead of output matching the schema. When the response does not include a refusal and the model’s response has not been prematurely interrupted (as indicated by finish_reason), then the model’s response will reliably contain valid JSON matching the supplied schema.

JSON
{
  "id": "chatcmpl-9nYAG9LPNonX8DAyrkwYfemr3C8HC",
  "object": "chat.completion",
  "created": 1721596428,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "refusal": "I'm sorry, I cannot assist with that request."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 81,
    "completion_tokens": 11,
    "total_tokens": 92
  },
  "system_fingerprint": "fp_3407719c7f"
}
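
In practice, a handler can branch on those two signals before parsing. The following is a minimal, illustrative sketch in Python, assuming a completion object returned from the Chat Completions API with a JSON Schema response format; the exact error handling is up to the application.

Python
import json

choice = completion.choices[0]

if choice.message.refusal is not None:
    # The model refused the request; there is no schema-conforming JSON to parse.
    print(choice.message.refusal)
elif choice.finish_reason != "stop":
    # Generation was interrupted (e.g. it hit max_tokens), so the JSON may be incomplete.
    raise RuntimeError("Response was interrupted before the schema could be satisfied")
else:
    # Neither refused nor interrupted: the content is valid JSON matching the schema.
    data = json.loads(choice.message.content)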

Native SDK support

Our Python and Node SDKs have been updated with native support for Structured Outputs. Supplying a schema for tools or as a response format is as easy as supplying a Pydantic or Zod object, and our SDKs will handle converting the data type to a supported JSON schema, deserializing the JSON response into the typed data structure automatically, and parsing refusals if they arise.

The following example shows native support for Structured Outputs with function calling.

Python
from enum import Enum
from typing import Union

from pydantic import BaseModel

import openai
from openai import OpenAI


class Table(str, Enum):
    orders = "orders"
    customers = "customers"
    products = "products"


class Column(str, Enum):
    id = "id"
    status = "status"
    expected_delivery_date = "expected_delivery_date"
    delivered_at = "delivered_at"
    shipped_at = "shipped_at"
    ordered_at = "ordered_at"
    canceled_at = "canceled_at"


class Operator(str, Enum):
    eq = "="
    gt = ">"
    lt = "<"
    le = "<="
    ge = ">="
    ne = "!="


class OrderBy(str, Enum):
    asc = "asc"
    desc = "desc"


class DynamicValue(BaseModel):
    column_name: str


class Condition(BaseModel):
    column: str
    operator: Operator
    value: Union[str, int, DynamicValue]


class Query(BaseModel):
    table_name: Table
    columns: list[Column]
    conditions: list[Condition]
    order_by: OrderBy


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function.",
        },
        {
            "role": "user",
            "content": "look up all my orders in may of last year that were fulfilled but not delivered on time",
        },
    ],
    tools=[
        openai.pydantic_function_tool(Query),
    ],
)

print(completion.choices[0].message.tool_calls[0].function.parsed_arguments)

Native Structured Outputs support is also available for response_format.

Python
from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)

Additional use cases

Developers frequently use OpenAI’s models to generate structured data for various use cases. Some additional examples include:

Dynamically generating user interfaces based on the user’s intent

For example, developers can use Structured Outputs to create code- or UI-generating applications. All of the following examples use the same response_format, and can be used to generate varying UIs based on user input.

System
You are a user interface assistant. Your job is to help users visualize their website and app ideas.

Response format
The recursive "ui" schema shown under "Alternate approaches" below.

Assistant
{
  "type": "div",
  "label": "",
  "children": [
    {
      "type": "header",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "Green Thumb Gardening",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-title" }]
        },
        {
          "type": "div",
          "label": "Bringing Life to Your Garden",
          "children": [],
          "attributes": [{ "name": "className", "value": "site-tagline" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "header" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "About Us",
              "children": [
                {
                  "type": "div",
                  "label": "At Green Thumb Gardening, we specialize in transforming your outdoor spaces into beautiful, thriving gardens. Our team has decades of experience in horticulture and landscape design.",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "about-description" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "about-section" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "about-container" }]
    },
    {
      "type": "section",
      "label": "",
      "children": [
        {
          "type": "div",
          "label": "",
          "children": [
            {
              "type": "div",
              "label": "Our Services",
              "children": [
                {
                  "type": "div",
                  "label": "Garden Design",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Plant Care & Maintenance",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Seasonal Cleanup",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                },
                {
                  "type": "div",
                  "label": "Custom Landscaping",
                  "children": [],
                  "attributes": [
                    { "name": "className", "value": "service-item" }
                  ]
                }
              ],
              "attributes": [{ "name": "className", "value": "services-list" }]
            }
          ],
          "attributes": [{ "name": "className", "value": "content" }]
        }
      ],
      "attributes": [{ "name": "className", "value": "services-container" }]
    }
  ],
  "attributes": [{ "name": "className", "value": "landing-page" }]
}

[Rendered preview of the generated landing page.]

Separating a final answer from supporting reasoning or additional commentary

It can be useful to give the model a separate field for chain of thought to improve the final quality of the response.

JSON
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    },
    {
      "role": "user",
      "content": "9.11 and 9.9 -- which is bigger?"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "reasoning_schema",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "reasoning_steps": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "The reasoning steps leading to the final conclusion."
          },
          "answer": {
            "type": "string",
            "description": "The final answer, taking into account the reasoning steps."
          }
        },
        "required": ["reasoning_steps", "answer"],
        "additionalProperties": false
      }
    }
  }
}

Extracting structured data from unstructured data

For example, instructing the model to extract things like to-dos, due dates, and assignments from meeting notes.

JSON
POST /v1/chat/completions
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "Extract action items, due dates, and owners from meeting notes."
    },
    {
      "role": "user",
      "content": "...meeting notes go here..."
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "action_items",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "action_items": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "description": {
                  "type": "string",
                  "description": "Description of the action item."
                },
                "due_date": {
                  "type": ["string", "null"],
                  "description": "Due date for the action item, can be null if not specified."
                },
                "owner": {
                  "type": ["string", "null"],
                  "description": "Owner responsible for the action item, can be null if not specified."
                }
              },
              "required": ["description", "due_date", "owner"],
              "additionalProperties": false
            },
            "description": "List of action items from the meeting."
          }
        },
        "required": ["action_items"],
        "additionalProperties": false
      }
    }
  }
}

Under the hood

We took a two-part approach to improving reliability for model outputs that match JSON Schema. First, we trained our newest model, gpt-4o-2024-08-06, to understand complicated schemas and how best to produce outputs that match them. However, model behavior is inherently non-deterministic—despite this model’s performance improvements (93% on our benchmark), it still did not meet the reliability that developers need to build robust applications. So we also took a deterministic, engineering-based approach to constrain the model’s outputs to achieve 100% reliability.

Constrained decoding

Our approach is based on a technique known as constrained sampling or constrained decoding. By default, when models are sampled to produce outputs, they are entirely unconstrained and can select any token from the vocabulary as the next output. This flexibility is what allows models to make mistakes; for example, they are generally free to sample a curly brace token at any time, even when that would not produce valid JSON. In order to force valid outputs, we constrain our models to only tokens that would be valid according to the supplied schema, rather than all available tokens.

It can be challenging to implement this constraining in practice, since the tokens that are valid differ throughout a model’s output. Let’s say we have the following schema:

JSON
{
  "type": "object",
  "properties": {
    "value": { "type": "number" }
  },
  "required": ["value"],
  "additionalProperties": false
}

The tokens that are valid at the beginning of the output include things like {, {", {\n, etc. However, once the model has already sampled {"val, then { is no longer a valid token. Thus we need to implement dynamic constrained decoding, and determine which tokens are valid after each token is generated, rather than upfront at the beginning of the response.

To do this, we convert the supplied JSON Schema into a context-free grammar (CFG). A grammar is a set of rules that defines a language, and a context-free grammar is one whose production rules can be applied regardless of the surrounding context. You can think of JSON and JSON Schema as particular languages with rules to define what is valid within the language. Just as it’s not valid in English to have a sentence with no verb, it is not valid in JSON to have a trailing comma.

Thus, for each JSON Schema, we compute a grammar that represents that schema, and pre-process its components to make it easily accessible during model sampling. This is why the first request with a new schema incurs a latency penalty—we must preprocess the schema to generate this artifact that we can use efficiently during sampling.
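
As an illustration of what such a grammar looks like, here is a toy grammar for the single-property schema above, written with the open source lark library (one of the projects acknowledged at the end of this post). It is a sketch of the idea only, not OpenAI's internal grammar representation.

Python
from lark import Lark

# Toy context-free grammar for the {"value": <number>} schema shown above.
grammar = r"""
    start: "{" KEY ":" NUMBER "}"
    KEY: /"value"/
    NUMBER: /-?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?/
    %ignore /[ \t\r\n]+/
"""

parser = Lark(grammar)
parser.parse('{ "value": 42.5 }')      # conforms to the schema, so it parses
# parser.parse('{ "value": "oops" }')  # would raise an error: a string is not allowed here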

While sampling, after every token, our inference engine determines which tokens are valid to be produced next, based on the previously generated tokens and the rules of the grammar. We then use this list of tokens to mask the next sampling step, which effectively lowers the probability of invalid tokens to 0. Because we have preprocessed the schema, we can use a cached data structure to do this efficiently, with minimal latency overhead.
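
Here is a minimal sketch of that masking step, assuming a PyTorch-style 1-D logits tensor and a list of token ids the grammar currently allows; computing that list is the grammar walk described above.

Python
import torch

def sample_constrained(logits: torch.Tensor, valid_token_ids: list[int]) -> int:
    """Sample the next token while allowing only grammar-approved tokens."""
    mask = torch.full_like(logits, float("-inf"))
    mask[valid_token_ids] = 0.0                   # allowed tokens keep their original scores
    probs = torch.softmax(logits + mask, dim=-1)  # disallowed tokens end up with probability 0
    return int(torch.multinomial(probs, num_samples=1).item())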

Alternate approaches

Alternate approaches to this problem often use finite state machines (FSMs) or regexes (generally implemented with FSMs) for constrained decoding. These function similarly in that they dynamically update which tokens are valid after each token is produced, but they have some key differences from the CFG approach. Notably, CFGs can express a broader class of languages than FSMs. In practice, this doesn’t matter for very simple schemas like the value schema shown above. However, we find that the difference is meaningful for more complex schemas that involve nested or recursive data structures. As an example, FSMs cannot generally express recursive types, which means FSM-based approaches may struggle to match braces in deeply nested JSON. The following is a sample recursive schema that is supported on the OpenAI API with Structured Outputs but would not be possible to express with an FSM.

JSON
{
  "name": "ui",
  "description": "Dynamically generated UI",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "type": {
        "type": "string",
        "description": "The type of the UI component",
        "enum": ["div", "button", "header", "section", "field", "form"]
      },
      "label": {
        "type": "string",
        "description": "The label of the UI component, used for buttons or form fields"
      },
      "children": {
        "type": "array",
        "description": "Nested UI components",
        "items": {
          "$ref": "#"
        }
      },
      "attributes": {
        "type": "array",
        "description": "Arbitrary attributes for the UI component, suitable for any element",
        "items": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "description": "The name of the attribute, for example onClick or className"
            },
            "value": {
              "type": "string",
              "description": "The value of the attribute"
            }
          }
        }
      }
    },
    "required": ["type", "label", "children", "attributes"],
    "additionalProperties": false
  }
}

Note that each UI element can have arbitrary children which reference the root schema recursively. This flexibility is something that the CFG approach affords.
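
For comparison with the SDK examples earlier, the same recursive shape can be expressed as a self-referencing Pydantic model. This is a sketch only; the enum of component types and the field descriptions from the schema above are reduced to comments here.

Python
from pydantic import BaseModel


class Attribute(BaseModel):
    name: str
    value: str


class UI(BaseModel):
    type: str                    # e.g. "div", "button", "header", "section", "field", "form"
    label: str
    children: list["UI"]         # recursive reference back to the component itself
    attributes: list[Attribute]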

Limitations and restrictions

There are a few limitations to keep in mind when using Structured Outputs:

  • Structured Outputs allows only a subset of JSON Schema, detailed in our docs. This helps us ensure the best possible performance.

  • The first API response with a new schema will incur additional latency, but subsequent responses will be fast with no latency penalty. This is because during the first request, we process the schema as indicated above and then cache these artifacts for fast reuse later on. Typical schemas take under 10 seconds to process on the first request, but more complex schemas may take up to a minute.

  • The model can fail to follow the schema if the model chooses to refuse an unsafe request. If it chooses to refuse, the returned message will have the refusal string set to indicate this.

  • The model can fail to follow the schema if the generation reaches max_tokens or another stop condition before finishing. 

  • Structured Outputs doesn’t prevent all kinds of model mistakes. For example, the model may still make mistakes within the values of the JSON object (e.g., getting a step wrong in a mathematical equation). If developers find mistakes, we recommend providing examples in the system instructions or splitting tasks into simpler subtasks.

  • Structured Outputs is not compatible with parallel function calls. When a parallel function call is generated, it may not match supplied schemas. Set parallel_tool_calls: false to disable parallel function calling (see the sketch after this list).

  • JSON Schemas supplied with Structured Outputs aren’t Zero Data Retention (ZDR) eligible.
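
As referenced in the list above, disabling parallel function calling is a single request parameter. The following is a minimal sketch with the Python SDK, assuming client, messages, and strict tools are defined as in the earlier examples.

Python
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=messages,
    tools=tools,                # function definitions with "strict": true
    parallel_tool_calls=False,  # parallel calls may not match the supplied schemas
)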

Availability

Structured Outputs is generally available today in the API. 

Structured Outputs with function calling is available on all models that support function calling in the API. This includes our newest models (gpt-4o, gpt-4o-mini), all models after and including gpt-4-0613 and gpt-3.5-turbo-0613, and any fine-tuned models that support function calling. This functionality is available on the Chat Completions API, Assistants API, and Batch API. Structured Outputs with function calling is also compatible with vision inputs.

Structured Outputs with response formats is available on gpt-4o-mini and gpt-4o-2024-08-06, and any fine-tuned models based on these models. This functionality is available on the Chat Completions API, Assistants API, and Batch API. Structured Outputs with response formats is also compatible with vision inputs.

By switching to the new gpt-4o-2024-08-06, developers save 50% on inputs ($2.50/1M input tokens) and 33% on outputs ($10.00/1M output tokens) compared to gpt-4o-2024-05-13.

To start using Structured Outputs, check out our docs.

Acknowledgements

Structured Outputs takes inspiration from excellent work from the open source community: namely, the outlines, jsonformer, instructor, guidance, and lark libraries.

Core contributors

Chris Colby, Melody Guan, Michelle Pokrass, Ted Sanders, Brian Zhang

Acknowledgements

John Allard, Filipe de Avila Belbute Peres, Ilan Bigio, Owen Campbell-Moore, Chen Ding, Atty Eleti, Elie Georges, Katia Gil Guzman, Jeff Harris, Johannes Heidecke, Beth Hoover, Romain Huet, Tomer Kaftan, Jillian Khoo, Karolis Kosas, Ryan Liu, Kevin Lu, Lindsay McCallum, Rohan Nuttall, Joe Palermo, Leher Pathak, Ishaan Singal, Felipe Petroski Such, Freddie Sulit, David Weedon