ChatGPT agent System Card: OpenAI’s agentic model unites research, browser automation, and code tools with safeguards under the Preparedness Framework.
Introducing ChatGPT agent: it thinks and acts, using tools to complete tasks like research, bookings, and slideshows—all with your guidance.
We study how training on incorrect responses can cause broader misalignment in language models and identify an internal feature driving this behavior—one that can be reversed with minimal fine-tuning.
We are replacing the existing GPT-4o-based model for Operator with a version based on OpenAI o3. The API version will remain based on GPT-4o.
Codex is a cloud-based coding agent. Codex is powered by codex-1, a version of OpenAI o3 optimized for software engineering. codex-1 was trained using reinforcement learning on real-world coding tasks in a variety of environments to generate code that closely mirrors human style and PR preferences, adheres precisely to instructions, and iteratively runs tests until passing results are achieved.
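Introducing Codex: a cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. With Codex, developers can deploy multiple agents at once to independently handle coding tasks such as writing features, answering questions about a codebase, fixing bugs, and proposing pull requests for review.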
HealthBench is a new evaluation benchmark for AI in healthcare which evaluates models in realistic scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model performance and safety in health.
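By reasoning with images in their chain of thought, OpenAI o3 and o4-mini achieve a significant breakthrough in visual perception.
OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities: web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory.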