OpenAI Research

Research

Switch cards to show Media

Switch cards to hide Media

Research

Jul 8, 2026

Separating signal from noise in coding evaluations

A new analysis from OpenAI reveals issues in SWE-Bench Pro, a popular coding benchmark, raising concerns about reliability and accuracy in evaluating AI models.

Product

Jul 8, 2026

Introducing GPT-Live

A new generation of voice models for natural human-AI interaction, now powering ChatGPT Voice.

Safety

Jul 8, 2026

GPT‑Live System Card

GPT-Live-1 and GPT-Live-1 mini are a new generation of voice models designed to make conversations with AI feel more natural and intelligent.

Research

Jun 30, 2026

Introducing GeneBench-Pro

Introducing GeneBench-Pro, a new benchmark testing AI performance in genomics, biology, and scientific research using complex, real-world datasets.

Product

Jun 26, 2026

Previewing GPT-5.6 Sol: a next-generation model

OpenAI previews GPT-5.6 Sol, a next-generation model with stronger capabilities in coding, science, and cybersecurity, paired with its most advanced safety stack.

Safety

Jun 26, 2026

GPT‑5.6 Preview System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch – our most robust yet – are built to deliver these models safely and at scale, around the world.

Research

Jun 17, 2026

A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry

OpenAI and Molecule.one show how a near-autonomous AI chemist using GPT-5.4 improved a key drug-making reaction, advancing medicinal chemistry research.

Research

Jun 17, 2026

Introducing LifeSciBench

Introducing LifeSciBench, an expert-authored, expert-reviewed benchmark for evaluating how AI systems handle real-world life science research tasks and decisions.

Research

Jun 4, 2026

Dreaming: Better memory for a more helpful ChatGPT

ChatGPT introduces a new memory system to better remember preferences, keeping context fresh and relevant across conversations.