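June 18, 2026

Improving health intelligence in ChatGPT

GPT‑5.5 Instant brings frontier health intelligence to more people, driven by advances in our models and by physician-led evaluation.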

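Health is one of the most meaningful ways people use ChatGPT. Every week, more than 230 million people turn to ChatGPT for help with health and wellness questions: understanding health information, interpreting test results, preparing for medical appointments, navigating insurance, building healthier habits, and working out what to ask next.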

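With GPT‑5.5 Instant, we are seeing major progress in how our models handle health questions, including better recognizing when urgent care may be needed, asking for relevant context, explaining uncertainty, and making complex information easier to understand. On our most challenging health evaluations, GPT‑5.5 Instant now performs on par with our frontier Thinking models. Because this model is also available to free ChatGPT users, more people can benefit from these improvements. This makes it easier for users to understand health information, know which questions to ask, and see clearly what to do next.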

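This progress reflects both gains in model capability and the physician-led work that underpins our health evaluations. Across our work, a global network of physicians reviews sample model responses, describes ideal behavior, and identifies failure modes, helping define what "good" looks like in real health situations. Working with physicians lets us measure progress in health and steadily improve how ChatGPT responds.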

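Measuring progress in health

In health, progress means giving responses that are accurate, easy to understand, and grounded in good judgment: recognizing when more context is needed, being candid about uncertainty, and helping people understand when to seek medical care.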

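To measure this progress, we use health-specific evaluations, including HealthBench and HealthBench Professional. These evaluations use realistic health conversations and physician-written rubrics to assess qualities such as accuracy, safety, communication, context awareness, completeness, and appropriate escalation.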
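
Rubric-based grading of this kind can be sketched roughly as follows. This is a minimal illustration, not OpenAI's implementation: the `Criterion` class, the point values, and the `rubric_score` function are hypothetical, and the scoring rule (points earned on met criteria divided by the total positive points available, clipped to the range 0 to 1) is an assumption about how such a rubric might be aggregated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One physician-written rubric item (hypothetical structure)."""
    description: str
    points: int   # positive = desirable behavior, negative = penalized behavior
    met: bool     # whether a grader judged the response to meet this criterion


def rubric_score(criteria: list[Criterion]) -> float:
    """Aggregate rubric judgments into a score in [0, 1].

    Assumed rule: sum the points of all met criteria, divide by the
    total of positive points available, and clip to [0, 1].
    """
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        raise ValueError("rubric needs at least one positively weighted criterion")
    earned = sum(c.points for c in criteria if c.met)
    return min(1.0, max(0.0, earned / max_points))


# Illustrative rubric for a single conversation; all values are made up.
example = [
    Criterion("States when urgent care may be needed", 5, True),
    Criterion("Asks for relevant missing context", 3, False),
    Criterion("Explains uncertainty plainly", 2, True),
    Criterion("Gives an overconfident diagnosis", -4, False),
]
print(rubric_score(example))  # 0.7
```

Under this assumed rule, a response that misses the context-seeking criterion earns 7 of 10 available points, and a response that triggers a penalized behavior can lose points but never scores below zero.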

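Aggregated across multiple health evaluations (including HealthBench Professional), GPT‑5.5 Instant reaches health performance close to that of our latest frontier models, a substantial improvement over GPT‑5.3 Instant. 5.5 Instant (released May 2026) and 5.3 Instant (released March 2026) are available to all free ChatGPT users (subject to usage limits), while we use API pricing to calculate the cost of 5.4 Thinking and 5.5 Thinking.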

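As a further comparison, we also asked physicians to write responses to representative health conversations, with unlimited time and access to the internet (but not to AI). A separate group of physicians then compared these physician-written responses with responses from Instant models of different periods, reviewing qualities that matter in real interactions, including accuracy, communication, completeness, instruction following, and helpfulness for health decisions. In total, 3,500 responses were reviewed.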

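In this evaluation, GPT‑5.5 Instant's responses were rated higher than both physician-written responses and older model responses on every dimension.

Physicians also rated GPT‑5.5 Instant's responses as showing fewer failure modes than those of older models and physicians. For example, compared with older models and physicians, GPT‑5.5 Instant less often failed to adapt to the local healthcare context, missed warning signs or failed to recommend seeking care, or failed to ask the user for more context when it was needed.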

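Given the scale at which our models are used for health, another way to understand recent model improvements is to measure production traffic. We use privacy-preserving monitoring on production traffic to track potential factuality issues in health responses. Based on a comparison of recent production health traffic (billions of messages per week), the share of responses containing at least one flagged factuality issue fell by 71% over the past two months.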
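
The metric in this paragraph, the share of responses with at least one flagged factuality issue and its relative change between two periods, can be illustrated with a short sketch. The function names and the sample counts are hypothetical and chosen only to show the arithmetic; they are not OpenAI's monitoring code or data.

```python
def flagged_share(issue_counts: list[int]) -> float:
    """Fraction of responses with at least one flagged issue.

    issue_counts[i] is the number of flagged factuality issues in
    response i (a hypothetical representation).
    """
    if not issue_counts:
        raise ValueError("need at least one response")
    return sum(1 for n in issue_counts if n >= 1) / len(issue_counts)


def relative_decline(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, e.g. 0.71 for a 71% decline."""
    if before <= 0:
        raise ValueError("baseline share must be positive")
    return (before - after) / before


# Made-up samples: 4 of 8 responses flagged earlier, 1 of 8 flagged later.
earlier = [0, 2, 1, 0, 1, 0, 3, 0]
later = [0, 0, 0, 1, 0, 0, 0, 0]
print(relative_decline(flagged_share(earlier), flagged_share(later)))  # 0.75
```

Read this way, a 71% relative decline means the later share is 29% of the earlier one; it does not by itself say what either share is.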

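What makes a better response

Comparing how models from different points in time respond to real health questions shows how ChatGPT has improved on key aspects of health: recognizing which situations may need urgent attention, handling uncertainty with better judgment, and giving clearer, more useful guidance on next steps.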

Why might a doctor recommend an MRI before a steroid injection for sciatica?

GPT-5.2 Instant

A doctor may recommend an MRI before a steroid injection for sciatica because the injection is more likely to be useful—and safer—when they know exactly what is compressing or irritating the nerve and where.

Common reasons include:

Confirming the cause of sciatica. Sciatica-like pain can come from a herniated disc, spinal stenosis, foraminal narrowing, spondylolisthesis, tumor, infection, fracture, or sometimes non-spine causes. MRI is the preferred imaging test for seeing discs, nerves, and soft tissue nerve compression. (emedicine.medscape.com)
Choosing the correct injection level and side. If symptoms suggest the L5 nerve root, for example, the MRI can show whether the problem is actually at L4–L5, L5–S1, or another level. This helps decide where to place the epidural or selective nerve-root injection.
Selecting the safest/most appropriate injection approach. Epidural steroid injections can be done by different routes—interlaminar, transforaminal, or caudal. Pre-procedure imaging helps the clinician plan the route and avoid anatomic issues that could increase risk. Many interventionalists consider advanced imaging a prerequisite because it helps optimize medication delivery and reduce risk to neural or vascular structures. (pmc.ncbi.nlm.nih.gov)
Checking for “red flags” or conditions where an injection may not be appropriate. MRI can help identify problems such as infection, cancer, severe stenosis, large disc extrusion, or other findings that might require urgent evaluation or a different treatment.
Determining whether surgery or another treatment should be considered instead. If there is severe nerve compression, progressive weakness, or certain structural problems, an injection may not be the best next step.
Avoiding a low-yield injection. If the MRI does not show nerve-root compression or inflammation that matches your symptoms, the doctor may reconsider whether a steroid injection is likely to help.

That said, MRI is not always required for every case of sciatica. In people without red flags, many cases improve with time, physical therapy, and medication. Some studies have questioned whether routine MRI before lumbar epidural steroid injection always improves outcomes, but guidelines and many clinicians still favor imaging when symptoms persist and an injection or surgery is being considered. (pmc.ncbi.nlm.nih.gov)

You should ask your doctor: “What are you looking for on the MRI, and how would the result change the injection plan?” That usually clarifies whether the MRI is for diagnosis, safety, targeting, or deciding whether an injection is appropriate at all.

Cited

Lumbosacral Radiculopathy Workup: Imaging Studies, Other Tests, https://emedicine.medscape.com/article/95025-workup
Diagnostic and therapeutic spinal interventions: Epidural injections - PMC, https://pmc.ncbi.nlm.nih.gov/articles/PMC5764532/
Should routine MRI of the lumbar spine be required prior to lumbar epidural steroid injection for sciatica pain? - PMC, https://pmc.ncbi.nlm.nih.gov/articles/PMC4392534/

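The medical expertise behind the progress

This progress has been shaped by physicians, who help us define, measure, and improve health responses in ChatGPT.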

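OpenAI works with a global network of more than 260 physicians spanning 60 countries and regions, 49 languages, and 26 medical specialties. Their feedback shapes how ChatGPT answers health questions across a range of situations, from everyday wellness questions to more complex clinical cases.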

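Physicians review sample model responses and assess whether they are accurate, clear, complete, appropriately cautious, and useful. They help identify where a response may miss important context, where its tone may be overconfident, where next steps should be explained more clearly, or where it should more directly encourage the user to seek medical care.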

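To date, physicians have reviewed more than 700,000 sample model responses, reflecting how clinicians and patients might use ChatGPT in the real world. Every few minutes, a physician reviews a new response. Their feedback is turned into rubrics and evaluation criteria that help researchers measure whether responses are accurate, safe, clear, complete, appropriately cautious, and useful in real health situations. This gives us a clearer view of where the models are improving and where work remains.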