OpenAI Research | Publications

Research

Publication

August 1, 2026

Ten advances in mathematics and theoretical computer science

OpenAI shares new results on long-standing open problems in mathematics and theoretical computer science, including advances in geometry, cryptography, and complexity.

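Research

July 29, 2026

Enabling two settings tripled our score on the ARC-AGI-3 benchmark

How two API settings improve GPT-5.6's score and efficiency on ARC-AGI-3 by preserving reasoning and enabling compaction.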

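Publication

July 28, 2026

Scientific computing in the age of agentic AI

A new field report shows how scientists are using AI coding agents to modernize scientific computing, accelerating software development and scientific discovery in fields such as genomics.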

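Safety

July 15, 2026

GPT-Red: Unlocking self-improving robustness

Learn about GPT-Red, OpenAI's automated red-teaming system, which uses self-play to improve AI safety, alignment, and robustness to prompt injection.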

Safety

July 9, 2026

GPT‑5.6 System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch—our most robust yet—are built to deliver these models safely and at scale, around the world.

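Research

July 8, 2026

Separating signal from noise in coding evaluations

A new OpenAI analysis reveals problems in the popular coding benchmark SWE-Bench Pro, raising concerns about the reliability and accuracy of AI model evaluations.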

Safety

July 8, 2026

GPT‑Live System Card

GPT-Live-1 and GPT-Live-1 mini are a new generation of voice models designed to make conversations with AI feel more natural and intelligent.

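Research

June 30, 2026

Introducing GeneBench-Pro

Introducing GeneBench-Pro: a new benchmark that uses complex, real-world datasets to evaluate AI performance in genomics, biology, and scientific research.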

Safety

June 26, 2026

GPT‑5.6 Preview System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch – our most robust yet – are built to deliver these models safely and at scale, around the world.