Skip to main content

October 3, 2024

Introducing canvas

A new way of working with ChatGPT to write and code

The image shows a vertical toolbar featuring five icons arranged in a column on a soft pastel background. The third icon from the top, depicting an open book, is highlighted with a label next to it reading "Reading Level."

We’re introducing canvas, a new interface for working with ChatGPT on writing and coding projects that go beyond simple chat. Canvas opens in a separate window, allowing you and ChatGPT to collaborate on a project. This early beta introduces a new way of working together—not just through conversation, but by creating and refining ideas side by side.   

Canvas was built with GPT-4o and can be manually selected in the model picker while in beta. Starting today we’re rolling out canvas to ChatGPT Plus and Team users globally. Enterprise and Edu users will get access next week. We also plan to make canvas available to all ChatGPT Free users when it’s out of beta.

Better collaboration with ChatGPT

People use ChatGPT every day for help with writing and code. Although the chat interface is easy to use and works well for many tasks, it’s limited when you want to work on projects that require editing and revisions. Canvas offers a new interface for this kind of work.

With canvas, ChatGPT can better understand the context of what you’re trying to accomplish. You can highlight specific sections to indicate exactly what you want ChatGPT to focus on. Like a copy editor or code reviewer, it can give inline feedback and suggestions with the entire project in mind.

You control the project in canvas. You can directly edit text or code. There’s a menu of shortcuts for you to ask ChatGPT to adjust writing length, debug your code, and quickly perform other useful actions. You can also restore previous versions of your work by using the back button in canvas.

Canvas opens automatically when ChatGPT detects a scenario in which it could be helpful. You can also include “use canvas” in your prompt to open canvas and use it to work on an existing project.

Writing shortcuts include:

  • Suggest edits: ChatGPT offers inline suggestions and feedback.

  • Adjust the length: Edits the document length to be shorter or longer.

  • Change reading level: Adjusts the reading level, from Kindergarten to Graduate School.

  • Add final polish: Checks for grammar, clarity, and consistency.

  • Add emojis: Adds relevant emojis for emphasis and color.

Coding in canvas

Coding is an iterative process, and it can be hard to follow all the revisions to your code in chat. Canvas makes it easier to track and understand ChatGPT’s changes, and we plan to continue improving transparency into these kinds of edits.

Coding shortcuts include:

  • Review code: ChatGPT provides inline suggestions to improve your code.

  • Add logs: Inserts print statements to help you debug and understand your code.

  • Add comments: Adds comments to the code to make it easier to understand.

  • Fix bugs: Detects and rewrites problematic code to resolve errors.

  • Port to a language: Translates your code into JavaScript, TypeScript, Python, Java, C++, or PHP.

Training the model to become a collaborator

We trained GPT-4o to collaborate as a creative partner. The model knows when to open a canvas, make targeted edits, and fully rewrite. It also understands broader context to provide precise feedback and suggestions.

To support this, our research team developed the following core behaviors:

  • Triggering the canvas for writing and coding 

  • Generating diverse content types

  • Making targeted edits

  • Rewriting documents

  • Providing inline critique

We measured progress with over 20 automated internal evaluations. We used novel synthetic data generation techniques, such as distilling outputs from OpenAI o1-preview, to post-train the model for its core behaviors. This approach allowed us to rapidly address writing quality and new user interactions, all without relying on human-generated data.

A key challenge was defining when to trigger a canvas. We taught the model to open a canvas for prompts like “Write a blog post about the history of coffee beans” while avoiding over-triggering for general Q&A tasks like “Help me cook a new recipe for dinner.” For writing tasks, we prioritized improving “correct triggers” (at the expense of “correct non-triggers”), reaching 83% compared to a baseline zero-shot GPT-4o with prompted instructions.

It’s worth noting that the quality of such baselines is highly sensitive to the specific prompt used. With different prompts, the baseline may still perform poorly but in a different manner—for instance, by being evenly inaccurate across coding and writing tasks, resulting in a different distribution of errors and alternative forms of suboptimal performance. For coding, we intentionally biased the model against triggering to avoid disrupting our power users. We'll continue refining this based on user feedback.

Canvas Decision Boundary Trigger - Writing & Coding
Prompted GPT-4o
GPT-4o with canvas
0Writing (correct trigger canvas)0Writing (correct don’t trigger canvas)0Coding (correct don’t trigger canvas)0Coding (correct trigger canvas)0.000.100.200.300.400.500.600.700.800.901.00

For writing and coding tasks, we improved correctly triggering the canvas decision boundary, reaching 83% and 94% respectively compared to a baseline zero-shot GPT-4o with prompted instructions.

A second challenge involved tuning the model's editing behavior once the canvas was triggered—specifically deciding when to make a targeted edit versus rewriting the entire content. We trained the model to perform targeted edits when users explicitly select text through the interface, otherwise favoring rewrites. This behavior continues to evolve as we refine the model.

Canvas Edits Boundary - Writing & Coding
Prompted GPT-4o
GPT-4o with canvas
Canvas full rewriteCanvas targeted edit0.00.10.20.30.40.50.60.70.80.91.0

For writing and coding tasks, we prioritized improving canvas targeted edits. GPT-4o with canvas performs better than a baseline prompted GPT-4o by 18%.

Finally, training the model to generate high-quality comments required careful iteration. Unlike the first two cases, which are easily adaptable to automated evaluation with thorough manual reviews, measuring quality in an automated way is particularly challenging. Therefore, we used human evaluations to assess comment quality and accuracy. Our integrated canvas model outperforms the zero-shot GPT-4o with prompted instructions by 30% in accuracy and 16% in quality, showing that synthetic training significantly enhances response quality and behavior compared to zero-shot prompting with detailed instructions.

Canvas Suggested Comments
Prompted GPT-4o
GPT-4o with canvas
Comment’s Triggering CorrectnessQuality of Suggested Comments0.00.10.20.30.40.50.60.70.80.91.0

Human evaluations assessed canvas comment quality and accuracy functionality. Our canvas model outperforms the zero-shot GPT-4o with prompted instructions by 30% in accuracy and 16% in quality.

What’s next

Making AI more useful and accessible requires rethinking how we interact with it. Canvas is a new approach and the first major update to ChatGPT’s visual interface since we launched two years ago.

Canvas is in early beta, and we plan to rapidly improve its capabilities.

Author

OpenAI

Research Lead

Karina Nguyen

Core Research

Kai Chen, Michael Wu, Tarun Gogineni

Core Engineering, Product, Design

Alexi Christakis, Bryan Ashley, Bryant Jow, Chris Haugli, Daniel Levine, Eric Jiang, Gabriel Peal, Lee Byron, Lukas Gross, Matt Lim, Sara Culver, Thomas Dimson

Contributors

Andrew Gibiansky, Andrew Howell, Arianna McClain, David Li, Doug Li, Ilya Kostrikov, Katy Shi, Noah Deutsch, Randall Lin, Sara Culver, Sean Fitzgerald, Shuaiqi Xia, Spencer Papay, Thomas Shadwell, Valerie Qi, Xiaolin Hao, Yilei Qian

Supporting Leadership

Akshay Nathan, Barret Zoph, Ian Silber, Joanne Jang, John Schulman, Kevin Weil, Mia Glaese, Mira Murati, Nick Turley, Sam Altman, Sulman Choudhry