OpenAI Research | Publications

Research

Publication

August 1, 2026

Ten advances in mathematics and theoretical computer science

OpenAI shares new results on long-standing open problems in mathematics and theoretical computer science, including advances in geometry, cryptography, and complexity.

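Research

July 29, 2026

Enabling two settings triples ARC-AGI-3 benchmark scores

How two API settings raise GPT-5.6's score and efficiency on ARC-AGI-3 by preserving reasoning and enabling context compression.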

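Publication

July 28, 2026

Scientific computing in the age of agentic AI

A new field report shows how scientists are using AI coding agents to transform scientific computing, speeding up software development and scientific discovery in genomics and other fields.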

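Safety

July 15, 2026

GPT-Red: Unlocking self-improvement for greater robustness

Meet GPT-Red, OpenAI's automated red-teaming system, which uses self-play to improve AI safety, alignment, and resistance to prompt injection attacks.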

Safety

July 9, 2026

GPT‑5.6 System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch—our most robust yet—are built to deliver these models safely and at scale, around the world.

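Research

July 8, 2026

Separating signal from noise in coding evaluations

A new OpenAI analysis reveals problems with the popular coding benchmark SWE-Bench Pro, raising doubts about the reliability and accuracy of AI model evaluations.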

Safety

July 8, 2026

GPT‑Live System Card

GPT-Live-1 and GPT-Live-1 mini are a new generation of voice models designed to make conversations with AI feel more natural and intelligent.

Research

June 30, 2026

Introducing GeneBench-Pro

GeneBench-Pro is a new benchmark that uses complex real-world datasets to evaluate AI performance in genomics, biology, and scientific research.

Safety

June 26, 2026

GPT‑5.6 Preview System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch – our most robust yet – are built to deliver these models safely and at scale, around the world.