March 24, 2023

March 20 ChatGPT outage: Here’s what happened

An update on our findings, the actions we’ve taken, and technical details of the bug.

Loading…

We took ChatGPT offline earlier this week due to a bug in an open-source library which allowed some users to see titles from another active user’s chat history. It’s also possible that the first message of a newly-created conversation was visible in someone else’s chat history if both users were active around the same time.

The bug is now patched. We were able to restore both the ChatGPT service and, later, its chat history feature, with the exception of a few hours of history. As promised, we’re publishing more technical details⁠ of this problem below.

Upon deeper investigation, we also discovered that the same bug may have caused the unintentional visibility of payment-related information of 1.2% of the ChatGPT Plus subscribers who were active during a specific nine-hour window. In the hours before we took ChatGPT offline on Monday, it was possible for some users to see another active user’s first and last name, email address, payment address, credit card type and the last four digits (only) of a credit card number, and credit card expiration date. Full credit card numbers were not exposed at any time.

We believe the number of users whose data was actually revealed to someone else is extremely low. To access this information, a ChatGPT Plus subscriber would have needed to do one of the following:

Open a subscription confirmation email sent on Monday, March 20, between 1 a.m. and 10 a.m. Pacific time. Due to the bug, some subscription confirmation emails generated during that window were sent to the wrong users. These emails contained the credit card type and last four digits of another user’s credit card number, but full credit card numbers did not appear. It’s possible that a small number of subscription confirmation emails might have been incorrectly addressed prior to March 20, although we have not confirmed any instances of this.
In ChatGPT, click on “My account,” then “Manage my subscription” between 1 a.m. and 10 a.m. Pacific time on Monday, March 20. During this window, another active ChatGPT Plus user’s first and last name, email address, payment address, the credit card type and last four digits (only) of a credit card number, and credit card expiration date might have been visible. It’s possible that this also could have occurred prior to March 20, although we have not confirmed any instances of this.

We have reached out to notify affected users that their payment information may have been exposed. We are confident that there is no ongoing risk to users’ data.

Everyone at OpenAI is committed to protecting our users’ privacy and keeping their data safe. It’s a responsibility we take incredibly seriously. Unfortunately, this week we fell short of that commitment, and of our users’ expectations. We apologize again to our users and to the entire ChatGPT community and will work diligently to rebuild trust.

Technical details

The bug was discovered in the Redis client open-source library, redis-py. As soon as we identified the bug, we reached out to the Redis maintainers with a patch to resolve the issue. Here’s how the bug worked:

We use Redis to cache user information in our server so we don’t need to check our database for every request.
We use Redis Cluster to distribute this load over multiple Redis instances.
We use the redis-py library to interface with Redis from our Python server, which runs with Asyncio.
The library maintains a shared pool of connections between the server and the cluster, and recycles a connection to be used for another request once done.
When using Asyncio, requests and responses with redis-py behave as two queues: the caller pushes a request onto the incoming queue, and will pop a response from the outgoing queue, and then return the connection to the pool.
If a request is canceled after the request is pushed onto the incoming queue, but before the response popped from the outgoing queue, we see our bug: the connection thus becomes corrupted and the next response that’s dequeued for an unrelated request can receive data left behind in the connection.
In most cases, this results in an unrecoverable server error, and the user will have to try their request again.
But in some cases the corrupted data happens to match the data type the requester was expecting, and so what gets returned from the cache appears valid, even if it belongs to another user.
At 1 a.m. Pacific time on Monday, March 20, we inadvertently introduced a change to our server that caused a spike in Redis request cancellations. This created a small probability for each connection to return bad data.

This bug only appeared in the Asyncio redis-py client for Redis Cluster, and has now been fixed.

Actions we’ve taken

As our investigation wraps up, supporting and informing our users is our top priority.

We have taken the following actions to improve our systems:

Extensively tested our fix to the underlying bug.
Added redundant checks to ensure the data returned by our Redis cache matches the requesting user.
Programatically examined our logs to make sure that all messages are only available to the correct user.
Correlated several data sources to precisely identify the affected users so that we can notify them.
Improved logging to identify when this is happening and fully confirm it has stopped.
Improved the robustness and scale of our Redis cluster to reduce the likelihood of connection errors at extreme load.

Where we go from here

The Redis open-source maintainers have been fantastic collaborators, swiftly addressing the bug and rolling out a patch. Redis, along with other open-source software, plays a crucial role in our research efforts. Their significance cannot be understated—we would not have been able to scale ChatGPT without Redis. We’re dedicated to continually supporting and contributing to the Redis community.