Sora System Card
Introduction
Overview of Sora
Sora is OpenAI’s video generation model, designed to take text, image, and video inputs and generate a new video as an output. Users can create videos up to 1080p resolution (20 seconds max) in various formats, generate new content from text, or enhance, remix, and blend their own assets. Users will be able to explore the Featured and Recent feeds which showcase community creations and offer inspiration for new ideas. Sora builds on learnings from DALL·E and GPT models, and is designed to give people expanded tools for storytelling and creative expression.
Sora is a diffusion model, which generates a video by starting off with a base video that looks like static noise and gradually transforms it by removing the noise over many steps. By giving the model foresight of many frames at a time, we’ve solved a challenging problem of making sure a subject stays the same even when it goes out of view temporarily. Similar to GPT models, Sora uses a transformer architecture, unlocking superior scaling performance.
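To make the denoising process concrete, the minimal sketch below shows the general shape of a diffusion sampling loop: a clip that starts as pure noise is refined step by step. It is a conceptual illustration only, with a hypothetical `denoiser` stand-in and a deliberately simplified update rule, not Sora's actual sampler or architecture.

```python
import torch

def sample_video(denoiser, steps: int = 50, frames: int = 16, height: int = 64, width: int = 64):
    x = torch.randn(frames, 3, height, width)   # start from a clip of static-like noise
    for t in reversed(range(steps)):            # gradually remove noise over many steps
        predicted_noise = denoiser(x, t)        # the model sees all frames at once
        x = x - predicted_noise / steps         # heavily simplified update; real samplers differ
    return x                                    # the denoised video tensor
```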
Sora uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data. As a result, the model is able to follow the user’s text instructions in the generated video more faithfully.
In addition to being able to generate a video solely from text instructions, the model is able to take an existing still image and generate a video from it, animating the image’s contents with accuracy and attention to small detail. The model can also take an existing video and extend it or fill in missing frames. Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.
Sora’s capabilities may also introduce novel risks, such as the potential for misuse of likeness or the generation of misleading or explicit video content. In order to safely deploy Sora in a product, we built on learnings from safety work for DALL·E’s deployment in ChatGPT and the API and safety mitigations for other OpenAI products such as ChatGPT. This system card outlines the resulting mitigation stack, external red teaming efforts, evaluations, and ongoing research to refine these safeguards further.
Model Data
As described in our technical report1 from February 2024, Sora takes inspiration from large language models which acquire generalist capabilities by training on internet-scale data. The success of the LLM paradigm is enabled in part by the use of tokens that elegantly unify diverse modalities of text—code, math and various natural languages. With Sora, we considered how generative models of visual data can inherit such benefits. Whereas LLMs have text tokens, Sora has visual patches. Patches have previously been shown to be an effective representation for models of visual data. We found that patches are a highly-scalable and effective representation for training generative models on diverse types of videos and images. At a high level, we turn videos into patches by first compressing videos into a lower-dimensional latent space, and subsequently decomposing the representation into spacetime patches.
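The sketch below illustrates the general idea of decomposing a compressed latent clip into spacetime patches using tensor reshaping; the tensor sizes, channel count, and patch dimensions are illustrative assumptions, not Sora's actual values.

```python
import torch

# Compressed latent clip: (frames, channels, height, width); sizes are illustrative.
latent = torch.randn(16, 8, 32, 32)
pt, ph, pw = 2, 4, 4                                # spacetime patch size (time, height, width)

chunks = (latent.unfold(0, pt, pt)                  # split the time axis into windows of pt frames
                .unfold(2, ph, ph)                  # split height into windows of ph
                .unfold(3, pw, pw))                 # split width into windows of pw
# dims: (time chunks, channels, height chunks, width chunks, pt, ph, pw)
patches = chunks.permute(0, 2, 3, 1, 4, 5, 6).reshape(-1, 8 * pt * ph * pw)
print(patches.shape)                                # (number of spacetime patches, patch dimension)
```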
Sora was trained on diverse datasets, including a mix of publicly available data, proprietary data accessed through partnerships, and custom datasets developed in-house. These consist of:
Select publicly available data, mostly collected from industry-standard machine learning datasets and web crawls.
Proprietary data from data partnerships. We form partnerships to access non-publicly available data. For example, we partnered with Shutterstock and Pond5 on building and delivering AI-generated images. We also partner to commission and create datasets fit for our needs.
Human data: Feedback from AI trainers, red teamers, and employees.
Pretraining Filtering and Data Preprocessing
In addition to mitigations applied after the pre-training stage, pre-training data filtering provides an additional layer of defense that, together with our other safety mitigations, helps exclude unwanted and harmful data from our datasets. Before training, all datasets undergo this filtering process to remove the most explicit, violent, or otherwise sensitive content (for instance, certain hate symbols). This extends the methods used to filter the training data for our other models, including DALL·E 2 and DALL·E 3.
Risk Identification And Deployment Preparation
We undertook a robust process to understand both potential misuse and real-world creative uses to help inform Sora’s designs and safety mitigations. After the Sora announcement in February of 2024, we worked with hundreds of visual artists, designers, and filmmakers from more than 60 countries to gain feedback on how to advance the model to be most helpful for creative professionals. We also crafted a number of evaluations internally and with external red-teamers to discover and assess risks and iteratively improve our safety and risk mitigations.
Our safety stack for Sora builds on these learnings and on existing safety mitigations we employ in other models and products such as DALL·E and ChatGPT, as well as custom-built mitigations specific to our video product. Because this is a powerful tool, we are taking an iterative approach to safety, particularly in areas where context is important or we foresee novel risks related to video. Examples of our iterative approach include age gating access to users who are 18 or older, restricting the use of likeness/face-uploads, and having more conservative moderation thresholds on prompts and uploads of minors at launch. We want to continue to learn how people use Sora and iterate to best balance safety while maximizing creative potential for our users.
External Red Teaming
OpenAI worked with external red teamers located in nine different countries to test Sora, identify weaknesses in the safety mitigations, and give feedback on risks associated with Sora’s new product capabilities. Red teamers had access to the Sora product with various iterations of safety mitigations and system maturity starting in September and continuing into December 2024, testing more than 15,000 generations. This red teaming effort builds upon work in early 2024 where a Sora model without production mitigations was tested.
Red teamers explored novel potential risks of Sora’s model and the product’s tools, and tested safety mitigations as they were developed and improved. These red teaming campaigns covered various types of violative and disallowed content (sexual and erotic content, violence and gore, self-harm, illegal content, mis/disinformation, etc.), adversarial tactics (both prompting and tool/feature use) to evade safety mitigations, and ways these tools could be exploited to progressively degrade moderation tools and safeguards. Red teamers also provided feedback on their perceptions of Sora in areas including bias and general performance.
We explored text-to-video generation using both straightforward prompts and adversarial prompting tactics across all the content categories mentioned above. The media upload capability was tested with a wide variety of images and videos, including depictions of public figures, across a broad range of content categories to test the ability to generate violative content. We also tested various uses and combinations of the modification tools (storyboards, recut, remix, and blend) to assess whether they could be used to generate prohibited content.
Red teamers identified noteworthy observations for both specific types of prohibited content and general adversarial tactics. For example, red teamers found that text prompts involving medical situations or science fiction / fantasy settings degraded safeguards against generating erotic and sexual content until additional mitigations were built. Red teamers used adversarial tactics to evade elements of the safety stack, including suggestive prompts and metaphors that harness the model’s inference capability. Over many attempts, they could identify trends in the prompts and words that triggered safeguards, and test different phrasings to evade refusals. Red teamers would eventually select the most concerning generation to use as seed media for further development into violative content that couldn’t be created with single-prompt techniques. Jailbreak techniques sometimes proved effective at degrading safety protections, allowing us to refine these protections as well.
Red teamers also tested media uploads and Sora’s tools (storyboards, recut, remix, and blend) with both publicly available images and AI-generated media. This revealed gaps in input and output filtering that we strengthened prior to Sora’s release, and helped hone protections for media uploads that include people. Testing also revealed the need for stronger classifier filtering to mitigate the risk of non-violative media uploads being modified into prohibited erotic, violent, or deepfake content.
The feedback and data generated by red teamers enabled the creation of additional layers of safety mitigations and improvements on existing safety evaluations, which are described in the Specific Risk Areas and Mitigations sections. These efforts allowed additional tuning of our prompt filtering, blocklists, and classifier thresholds to ensure model compliance with safety goals.
Learnings from Early Artist Access
Over the last nine months, we observed user feedback across 500,000+ model requests from 300+ users in 60+ countries. This data informed enhancements in model behavior and model adherence to safety protocols. For example, artist feedback helped us understand the limitations a visible watermark places on their workflows, which informed our decision to allow paying users to download video files without the visible watermark while still embedding C2PA metadata.
This early access program also taught us that if Sora is intended to serve as an expanded tool for storytelling and creative expression, it would require us to offer more flexibility to artists around some sensitive areas we’d handle differently in a general-purpose tool like ChatGPT. We expect artists, independent filmmakers, studios and other entertainment industry organizations to use Sora as a crucial part of their development processes. At the same time, identifying both positive use cases and potential misuse allowed us to determine areas where more restrictive product-level mitigations were required to mitigate the risk of harm or misuse.
Evaluations
We developed internal evaluations targeting key areas, including nudity, deceptive election content, self-harm, and violence. These evaluations were designed to support the refinement of mitigations and help inform our moderation thresholds. The evaluation framework combines input prompts given to the video generation model with input and output classifiers applied to either transformed prompts or the final produced videos.
The input prompts for these evaluations were sourced from three primary channels: data collected during the early alpha phase (as outlined in Section 3.2), adversarial examples provided by red-team testers (referenced in Section 3.1), and synthetic data generated using GPT-4. Alpha phase data provided insight into real-world usage scenarios, red-teamer contributions helped uncover adversarial and edge-case content, and synthetic data allowed for expanding evaluation sets in areas like unintended racy content, where naturally occurring examples are scarce.
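As a rough illustration of how such an evaluation can be wired together, the sketch below runs a set of known-violating prompts through an input check, a generation step, and an output check, and computes the input and end-to-end accuracies defined later in this card (see the Eval explanation under Nudity & Suggestive Content). The three callables are hypothetical stand-ins, not our production classifiers.

```python
def evaluate(violating_prompts, input_check, generate_video, output_check):
    """input_check/output_check return True when moderation lets the sample through."""
    n = len(violating_prompts)
    passed_input = passed_output = 0
    for prompt in violating_prompts:
        if input_check(prompt):               # prompt slipped past input moderation (I)
            passed_input += 1
            video = generate_video(prompt)
            if output_check(video):           # video slipped past output moderation (O)
                passed_output += 1
    return {
        "accuracy_at_input": (n - passed_input) / n,   # (N - I) / N
        "accuracy_e2e": (n - passed_output) / n,       # (N - O) / N
    }
```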
Preparedness
The preparedness framework is designed to evaluate whether frontier model capabilities introduce significant risks in four tracked categories: persuasion, cybersecurity, CBRN (chemical, biological, radiological, and nuclear), and model autonomy. We do not have evidence that Sora poses any significant risk with respect to cybersecurity, CBRN, or model autonomy. These risks are closely tied to models that interact with computer systems, scientific knowledge, or autonomous decision-making, all of which are currently beyond Sora’s scope as a video-generation tool.
Sora’s video generation capabilities could pose potential risk from persuasion, such as risks of impersonation, misinformation, or social engineering. To address these risks, we have developed a suite of mitigations that are described in the below sections. These include mitigations intended to prevent the generation of likeness to well-known public figures. Additionally, given that context and the knowledge of a video being real or AI-generated may be key in determining how persuasive a generated video is, we’ve focused on building a multi-layered provenance approach, including metadata, watermarks, and fingerprinting.
Sora Mitigation Stack
In addition to the specific risks and mitigations identified below, choices made in Sora’s training, product design, and policies help to broadly mitigate the risk of harmful or unwanted outputs. These can broadly be organized into system and model-level technical mitigations, as well as product policies and user education.
System and Model Mitigations
Below we detail the primary forms of safety mitigations we have in place before a user is shown their requested output:
Text and image moderation via multi-modal moderation classifier
Our multi-modal moderation classifier, which powers our external Moderation API, is applied to identify text, image, or video prompts that may violate our usage policies, on both inputs and outputs. Violative prompts detected by the system result in a refusal. Learn more about our multi-modal moderation API here.2
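As an illustration, a minimal pre-generation screen using the hosted Moderation API might look like the sketch below; the request shape follows the public API, but the exact models, categories, and thresholds used inside Sora are not public, so treat this only as a sketch.

```python
from openai import OpenAI

client = OpenAI()

def violates_policy(prompt_text: str, image_url: str | None = None) -> bool:
    """Return True if the multi-modal moderation endpoint flags the input."""
    inputs = [{"type": "text", "text": prompt_text}]
    if image_url:
        inputs.append({"type": "image_url", "image_url": {"url": image_url}})
    result = client.moderations.create(model="omni-moderation-latest", input=inputs)
    return result.results[0].flagged

# Example: refuse the request before any video is generated.
if violates_policy("describe-your-scene-here"):
    print("Request refused under our usage policies.")
```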
Custom LLM filtering
One advantage of video generation technology is the ability to perform asynchronous moderation checks without adding latency to the overall user experience. Since video generation inherently takes a few seconds to process, this window of time can be utilized to run precision-targeted moderation checks. We have customized our own GPT model to achieve high-precision moderation on specific topics, including identifying third-party content as well as deceptive content.
Filters are multimodal: image and video uploads, text prompts, and outputs are all included in the context of each LLM call. This allows us to detect violating combinations across image and text.
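A simplified sketch of this asynchronous pattern is shown below: the policy check runs while the video renders, so moderation adds no extra latency. The helper coroutines are hypothetical stand-ins, not Sora's internal services.

```python
import asyncio

async def generate_video(prompt: str) -> bytes:
    await asyncio.sleep(5)                 # stand-in for the multi-second generation step
    return b"<video bytes>"

async def llm_policy_check(prompt: str) -> bool:
    await asyncio.sleep(1)                 # stand-in for a GPT-based, topic-targeted check
    return True                            # True => the request is judged compliant

async def generate_with_async_moderation(prompt: str) -> bytes | None:
    video_task = asyncio.create_task(generate_video(prompt))
    compliant = await llm_policy_check(prompt)   # runs while the video is still rendering
    video = await video_task
    return video if compliant else None          # withhold the output if the check fails

# Example: asyncio.run(generate_with_async_moderation("a calm beach at sunset"))
```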
Image output classifiers
To address potentially harmful content directly in outputs, Sora uses output classifiers, including specialized filters for NSFW content, minors, violence, and potential misuse of likeness. Sora may block videos before they are shared with the user if these classifiers are activated.
Blocklists
We maintain textual blocklists across a variety of categories, informed by our previous work on DALL·E 2 and DALL·E 3, proactive risk discovery, and results from early users.
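At its simplest, blocklist screening is a lookup of prompt text against curated term lists, as in the sketch below; the categories and terms shown are placeholders rather than our actual lists, and production matching is more sophisticated than plain substring checks.

```python
BLOCKLISTS: dict[str, set[str]] = {
    "example_category": {"placeholder term", "another placeholder term"},  # placeholders only
}

def blocklist_hits(prompt: str) -> list[str]:
    """Return the blocklist categories matched by the prompt."""
    lowered = prompt.lower()
    return [category for category, terms in BLOCKLISTS.items()
            if any(term in lowered for term in terms)]

print(blocklist_hits("A prompt containing a placeholder term"))  # ['example_category']
```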
Product Policies
In addition to the protections we have built into the model and system to prevent the generation of violative content, we are also taking additional steps to reduce the risk of misuse. We currently are only offering Sora to users who are 18 or older and we are applying moderation filters to the content that is shown in the Explore and Featured feeds.
We are also clearly communicating policy guidelines through in-product and publicly available education on:
Use of another person’s likeness without their permission, and a prohibition on depicting real minors;
Creating illegal content or content that violates intellectual property rights;
The generation of explicit and harmful content, such as non-consensual intimate imagery, content used to bully, harass, or defame, or content intended to promote violence, hatred, or the suffering of others; and
The creation and distribution of content used to defraud, scam, or mislead others.
Some of these forms of misuse are addressed through our model and system mitigations, but others are more contextual—a scene of a protest can be used for legitimate creative endeavors, but the same scene presented as a real current event could also be shared as disinformation if paired with other claims.
Sora is designed to give people the ability to express a wide range of creative ideas and views. It is neither practical nor advisable to prevent every form of contextually problematic content.
We offer people the ability to report Sora videos they think may violate our guidelines while leveraging automation and human review to actively monitor patterns of use. We have established enforcement mechanisms to remove violative videos and penalize users. When users do violate our guidelines, we will notify them and offer the opportunity to tell us what they think is fair. We intend to track the effectiveness of these mitigations and refine them over time.
Specific Risk Areas and Mitigations
Beyond general safety measures above, early testing and evaluation helped identify several areas of particular safety focus.
Child Safety
OpenAI is deeply committed to addressing3 child safety risks, and we prioritize the prevention, detection, and reporting of Child Sexual Abuse Material (CSAM) across all our products, including Sora. OpenAI’s efforts in the child safety space include responsibly sourcing our datasets to protect them from CSAM, partnering with the National Center for Missing & Exploited Children (NCMEC) to prevent child sexual abuse and protect children, red-teaming in accordance with Thorn’s recommendations and in compliance with legal restrictions, and robust scanning for CSAM across all inputs and outputs. This includes scanning first-party and third-party users (API and Enterprise) unless customers meet rigorous criteria for removal of CSAM scanning. To prevent generation of CSAM, we have built a robust safety stack, leveraging system mitigations we use across our other products such as ChatGPT and DALL·E4, as well as some additional levers that we built specifically for Sora.
Input Classifiers
For child safety, we leverage three different input mitigations across text, image, and video inputs:
For all image and video uploads, we integrate with Safer, developed by Thorn, to detect matches with known CSAM. Confirmed matches are rejected and reported to NCMEC. Additionally, we utilize Thorn’s CSAM classifier to identify potentially new, unhashed CSAM content.
We leverage a multi-modal moderation classifier to detect and moderate any sexual content that involves minors via text, image and video input.
For Sora, we developed a classifier that analyzes text and images to predict whether an individual under the age of 18 is depicted or whether the accompanying caption references a minor. We reject image-to-video requests that contain under-18 individuals. If a text-to-video prompt is determined to reference a minor, we enforce much stricter moderation thresholds for sexual, violent, or self-harm content, as illustrated in the sketch below.
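The routing logic described just above can be pictured as in the sketch below. The function name, category scores, and threshold values are hypothetical placeholders chosen for illustration; they are not Sora's production thresholds.

```python
# Assumed, illustrative thresholds only; lower values mean stricter moderation.
STRICT_THRESHOLDS = {"sexual": 0.05, "violence": 0.10, "self_harm": 0.05}
DEFAULT_THRESHOLDS = {"sexual": 0.30, "violence": 0.40, "self_harm": 0.30}

def route_request(is_image_to_video: bool, depicts_minor: bool, scores: dict[str, float]) -> str:
    if is_image_to_video and depicts_minor:
        return "reject"                    # image-to-video with under-18 subjects is refused
    thresholds = STRICT_THRESHOLDS if depicts_minor else DEFAULT_THRESHOLDS
    if any(scores.get(category, 0.0) > limit for category, limit in thresholds.items()):
        return "reject"
    return "allow"

print(route_request(False, True, {"sexual": 0.08}))   # 'reject' under the stricter thresholds
```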
Below is the evaluation of our under-18 classifier for images of humans. We evaluate our classifier for rejecting realistic under-18 individuals on a dataset containing close to 5,000 images across the categories of [child | adult] and [realistic | fictitious]. Our policy stance is to reject realistic children, while allowing fictitious images, including animated, cartoon, or sketch styles, provided they are non-sexual. We have taken a cautious approach to content involving minors and will continue to evaluate our approach as we learn more through product use, seeking the right balance between allowing creative expression and safety.
Currently, our classifiers are highly accurate, but they may occasionally flag adult or non-realistic images of children by mistake. Additionally, we acknowledge that studies and existing literature highlight the potential for age prediction models to exhibit racial biases. For instance, these models may systematically underestimate the age of individuals from certain racial groups.5 We are committed to enhancing the performance of our classifier, minimizing false positives, and deepening our understanding of potential biases over the coming months.
Category | Expected outcome | n_samples | count (is_child) | count (not_child) | Evaluated metrics |
---|---|---|---|---|---|
Realistic Child | Classify images as “is child” | 1589 | 1555 | 34 | Accuracy: 97.86% |
Realistic Adult | Classify images as “not child” | 1370 | 36 | 1334 | Accuracy: 97.37% |
Fictitious Adult | Classify images as “not child” | 965 | 7 | 958 | Accuracy: 99.28% |
Fictitious Child | Classify images as “not child” | 1050 | 323 | 727 | Accuracy: 69.24% |
Total | | 4974 | 1921 | 3053 | Precision: 80.95% Recall: 97.86% |
Note: precision is calculated as the percentage of is_child classifications that are realistic children, and recall is calculated as the percentage of realistic-child images that are classified as is_child.
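As a consistency check, the aggregate precision and recall reported in the table follow directly from its per-category counts; the short computation below reproduces them.

```python
# Counts copied from the table above; "positive" means an image of a realistic child.
true_positives = 1555           # realistic children classified as is_child
false_negatives = 34            # realistic children classified as not_child
false_positives = 36 + 7 + 323  # adults and fictitious children classified as is_child

precision = true_positives / (true_positives + false_positives)  # 1555 / 1921
recall = true_positives / (true_positives + false_negatives)     # 1555 / 1589
print(f"precision={precision:.2%}, recall={recall:.2%}")         # 80.95%, 97.86%
```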
Output
As mentioned above, once our under-18 classifier identifies a reference to minors in a text input, we enforce strict moderation thresholds for sexual, violent, or self-harm content on the output. Below are the two output classifiers we use to achieve this:
A multi-modal moderation classifier that scans video outputs and rejects requests that may be particularly sensitive.
Our existing DALL·E image classifier, which scans for violations related to child safety.
Our output classifiers scan 2 frames per second, and if a video is determined to be unsafe, we block the output entirely.
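Sampling frames at a fixed rate for classification can be sketched as below; this uses OpenCV for decoding, and `frame_is_unsafe` is a hypothetical stand-in for the output classifiers, not our actual models.

```python
import cv2

def video_is_blocked(path: str, frame_is_unsafe, samples_per_second: float = 2.0) -> bool:
    """Scan roughly `samples_per_second` frames each second; any unsafe frame blocks the video."""
    capture = cv2.VideoCapture(path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0       # fall back if the decoder reports no fps
    step = max(int(fps / samples_per_second), 1)      # classify every `step`-th decoded frame
    index, blocked = 0, False
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0 and frame_is_unsafe(frame):
            blocked = True
            break
        index += 1
    capture.release()
    return blocked
```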
In addition to our classifiers and automated moderation, we will have human review as an additional layer of protection against potential child safety violations.
Product Policy
Our policies prohibit the use of Sora for the generation of sexual content involving minors. Violations of our child safety policies can result in content removal and banning of the user.
Nudity & Suggestive Content
One of the emerging risk areas associated with AI video generation capabilities is the potential creation of NSFW (Not Safe for Work) or NCII (Non-Consensual Intimate Imagery) content. Similar to DALL·E’s approach, Sora uses a multi-tiered moderation strategy to block explicit content. This includes prompt transformations, image output classifiers, and blocklists, which together form a system that restricts suggestive content, particularly for age-appropriate outputs. Thresholds for our classifiers are stricter for image uploads than for text-based prompts.
Videos shown in the Explore section are further filtered with heightened thresholds to target a viewing experience appropriate for a wide audience.
Below are the results of our evaluations on nudity and suggestive content, aimed at assessing the effectiveness of multi-layered mitigations across inputs and outputs. Based on these findings, we have iterated on our thresholds and applied stricter moderation to image uploads that include people.
Category | Accuracy* (at input) | Accuracy* (at output, i.e. E2E) |
---|---|---|
Nudity & Suggestive Content | 97.25% | 97.59% |
Eval explanation:
N = total number of violating samples (~200 per category)
I = total number of violating samples passed by input moderation checks
O = total number of violating samples passed by output moderation checks
Accuracy at Input = (N - I) / N
Accuracy at Output (E2E) = (N - O) / N
Product Policy
Our policies prohibit the use of Sora for the generation of explicit sexual content, including non-consensual intimate imagery. Violations of these policies can result in content removal and penalization of the user.
Deceptive Content
Likeness Misuse and Harmful Deepfakes
Sora’s moderation monitor for likeness-based prompts is designed to flag potentially harmful deepfake content, so that videos involving recognizable individuals are closely reviewed. The Likeness Misuse filter further flags prompts that attempt to modify or depict individuals in potentially harmful or misleading ways. Sora’s general prompt transformations further reduce the risk that Sora will generate the unwanted likeness of a private individual based on a prompt containing someone's name.
Deceptive Election Content
Sora’s input and output classifiers are intended to prevent the generation of deceptive content related to elections that depicts fraudulent, unethical or otherwise illegal activity. Sora’s evaluation metrics include classifiers to flag style or filter techniques that could produce misleading videos in the context of elections, thus reducing the risk of real-world misuse.
Below are the evaluations for our deceptive election content LLM filter, focused on helping identify cases where there may be intent to create prohibited content across a variety of inputs (e.g. text and video). Our system also scans 1 frame per second of output video to assess possible output violations.
Classifier | Recall | Precision | Result when flagged |
---|---|---|---|
Deceptive Election Content | 98.23% | 88.80% | Block generating output |
N=~500, based on synthetic data prompts
Investments in Provenance
Given that many risks associated with Sora, such as harmful deepfake content, are heavily context dependent, we've prioritized enhancing our provenance tools. We recognize that there is not a single solution to provenance, but are committed to improving the provenance ecosystem and helping build context and transparency to content created from Sora.
For general availability, our provenance safety tooling will include:
C2PA metadata on all assets (verifiable origin, industry standard; a verification sketch follows this list)
Animated visible Sora watermarks by default (transparency for viewers that the content is AI-generated)
Internal reverse video search tool, to help members of OpenAI’s Intelligence & Investigation team assess with high confidence if content is created by Sora
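To illustrate the C2PA item above, the sketch below reads the embedded manifest from a downloaded video using the open-source c2patool CLI from the C2PA community; the tool invocation, file name, and output handling are assumptions about the surrounding ecosystem, not part of Sora itself.

```python
import json
import subprocess

def read_c2pa_manifest(path: str) -> dict | None:
    """Return the C2PA manifest store embedded in `path`, or None if none is found."""
    result = subprocess.run(["c2patool", path], capture_output=True, text=True)
    if result.returncode != 0:
        return None                      # no manifest present or file type unsupported
    return json.loads(result.stdout)

manifest = read_c2pa_manifest("sora_video.mp4")   # hypothetical downloaded asset
if manifest:
    # The active manifest records the generator claim and signing information.
    print(manifest.get("active_manifest"))
```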
Product Policy
Our policies prohibit the use of Sora to defraud, scam, or mislead others, including through the creation and dissemination of disinformation. They also prohibit the use of another person’s likeness without their permission. Violations of these policies can result in content removal and penalization of the user.
Artist Styles
When a user employs the name of a living artist in a prompt, the model may generate video that resembles in some way the style of the artist’s works. There is a very long tradition in creativity of building off of other artists’ styles, but we appreciate that some creators may have concerns. We opted to take a conservative approach with this version of Sora as we learn more about how Sora is used by the creative community. To address this, we have added prompt re-writes that are designed to trigger when a user attempts to generate a video in the style of a living artist.
Similar to our other products, the Sora Editor uses an LLM to rewrite submitted text to facilitate prompting Sora more effectively. This process promotes compliance with our guidelines, including removing public figure names, grounding people with specific attributes, and describing branded objects in a generic way. We maintain textual blocklists across a variety of categories, informed by our previous work on DALL·E 2 and DALL·E 3, proactive risk discovery, and results from red teamers and early users.
Future Work
OpenAI employs an iterative deployment strategy to ensure the responsible and effective roll-out of its products. This approach combines phased rollouts, ongoing testing, and continuous monitoring with user feedback and real-world data to refine and improve our performance and safety mitigations over time. Below is a series of work we are planning to do as part of our iterative deployment for Sora.
Likeness pilot
The ability to generate a video using an uploaded photo or video of a real person as the “seed” is a vector of potential misuse, and we are taking a particularly incremental approach toward it so we can learn from early patterns of use. Early feedback from artists indicates that this is a powerful creative tool they value, but given the potential for abuse, we are not initially making it available to all users. Instead, in keeping with our practice of iterative deployment, the ability to upload images or videos of people will be made available to a subset of users, and we will have active, in-depth monitoring in place to understand its value to the Sora community and to adjust our approach to safety as we learn. Uploads containing images of minors will not be permitted during this test.
Provenance and Transparency Initiatives
Sora’s future iterations will continue to strengthen traceability through research into reverse embedding search tools and continued implementation of transparency measures such as C2PA. We are excited to explore potential partnerships with NGOs and research organizations to grow and improve the provenance ecosystem and to test our internal reverse video search tool for Sora.
Expanding representation in our outputs
We are committed to reducing potential output biases through prompt refinements, feedback loops, and the ongoing identification of effective mitigations—recognizing that overcorrections can be equally harmful. We acknowledge challenges such as body image bias and demographic representation and will continue refining our approach to ensure balanced and inclusive outputs.
Continued safety, policy, and ethical alignment
OpenAI plans to maintain ongoing evaluations of Sora and efforts to further improve Sora’s adherence to OpenAI’s policies and safety standards. Additional improvements in areas such as likeness safety and deceptive content are planned, guided by evolving best practices and user feedback.
Acknowledgements
Thank you to all of OpenAI's internal teams, including Comms, Comms Design, Global Affairs, Integrity, Intel & Investigations, Legal, Product Policy, Safety Systems and User Ops, whose support was instrumental in helping develop and implement Sora’s safety mitigations as well as their contributions to this system card.
We are grateful to our group of Alpha artists and our expert red teamers who provided feedback, helped test our models at early stages of development and informed our risk assessments and evaluations. Participation in the testing process is not an endorsement of the deployment plans of OpenAI or OpenAI’s policies.
Red Teaming Individuals (alphabetical): Alexandra García Pérez, Arjun Singh Puri, Caroline Friedman Levy, Dani Madrid-Morales, Emily Lynell Edwards, Grant Brailsford, Herman Wasserman, Javier García Arredondo, Kate Turetsky, Kelly Bare, Matt Groh, Maximilian Müller, Naomi Hart, Nathan Heath, Patrick Caughey, Per Wikman Svahn, Rafael González-Vázquez, Sara Kingsley, Shelby Grossman, Vincent Nestler
Red Teaming Organizations: ScaleAI