OpenAI Research | Publications

Research

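Safety

July 15, 2026

GPT-Red: Achieving self-improvement for greater robustness

Introducing GPT-Red, OpenAI's automated red-teaming system, which uses self-play to strengthen AI safety, alignment, and robustness against prompt injection.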

Safety

July 9, 2026

GPT‑5.6 System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch—our most robust yet—are built to deliver these models safely and at scale, around the world.

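Research

July 8, 2026

Separating signal from noise in coding evaluations

A new OpenAI analysis finds problems in the popular coding benchmark SWE-Bench Pro, raising concerns about the reliability and accuracy of AI model evaluation.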

Safety

July 8, 2026

GPT‑Live System Card

GPT-Live-1 and GPT-Live-1 mini are a new generation of voice models designed to make conversations with AI feel more natural and intelligent.

Research

June 30, 2026

Introducing GeneBench-Pro

Introducing GeneBench-Pro, a new benchmark that uses complex real-world datasets to evaluate AI performance in genomics, biology, and scientific research.

Safety

June 26, 2026

GPT‑5.6 Preview System Card

GPT-5.6 is a new family of three models: Sol, our new flagship model; Terra, a capable lower-cost option; and Luna, our fastest and most cost-efficient model. The safeguards we have built for this launch – our most robust yet – are built to deliver these models safely and at scale, around the world.

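Research

June 17, 2026

A semi-autonomous AI chemist improves difficult reactions in medicinal chemistry

OpenAI and Molecule.one present a case in which a semi-autonomous AI chemist powered by GPT-5.4 improved a key reaction in pharmaceutical synthesis, advancing medicinal chemistry research.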

Research

June 17, 2026

Introducing LifeSciBench

Introducing LifeSciBench, a benchmark written and reviewed by experts that evaluates how AI systems handle real-world life science research tasks and decisions.

Safety

May 5, 2026

GPT-5.5 Instant System Card