Frontier AI regulation: Managing emerging risks to public safety

Abstract

Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term “frontier AI” models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilities can arise unexpectedly; it is difficult to robustly prevent a deployed model from being misused; and it is difficult to stop a model’s capabilities from proliferating broadly. To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models. Industry self-regulation is an important first step. However, wider societal discussions and government intervention will be needed to create standards and to ensure compliance with them. We consider several options to this end, including granting enforcement powers to supervisory authorities and licensure regimes for frontier AI models. Finally, we propose an initial set of safety standards. These include conducting pre-deployment risk assessments; subjecting model behavior to external scrutiny; using risk assessments to inform deployment decisions; and monitoring and responding to new information about model capabilities and uses post-deployment. We hope this discussion contributes to the broader conversation on how to balance public safety risks and innovation benefits from advances at the frontier of AI development.

Acknowledgments

Report authors, alphabetical order

Markus Anderljung (Centre for the Governance of AI; Center for a New American Security) *†
Joslyn Barnhart (Google DeepMind) **
Anton Korinek (Brookings Institution; University of Virginia; Centre for the Governance of AI) **†
Jade Leung (OpenAI) *
Cullen O’Keefe (OpenAI) *
Jess Whittlestone (Centre for Long Term Resilience) **

Shahar Avin (Centre for the Study of Existential Risk, University of Cambridge)
Miles Brundage (OpenAI)
Justin Bullock (University of Washington; Convergence Analysis)
Duncan Cass-Beggs (Centre for International Governance Innovation)
Ben Chang (The Andrew W. Marshall Foundation)
Tantum Collins (GETTING-Plurality Network, Edmond & Lily Safra Center for Ethics; Harvard University)
Tim Fist (Center for a New American Security)
Gillian Hadfield (University of Toronto; Vector Institute; OpenAI)
Alan Hayes (Akin Gump Strauss Hauer & Feld LLP)
Lewis Ho (Google DeepMind)
Sara Hooker (Cohere For AI)
Eric Horvitz (Microsoft)
Noam Kolt (University of Toronto)
Jonas Schuett (Centre for the Governance of AI)
Yonadav Shavit (Harvard University) ***
Divya Siddarth (Collective Intelligence Project)
Robert Trager (Centre for the Governance of AI; University of California, Los Angeles)
Kevin Wolf (Akin Gump Strauss Hauer & Feld LLP)


Listed authors contributed substantive ideas and/or work to the white paper. Contributions include writing, editing, research, detailed feedback, and participation in a workshop on a draft of the paper. Given the size of the group, inclusion as an author does not entail endorsement of all claims in the paper, nor does inclusion entail an endorsement on the part of any individual’s organization.

*Significant contribution, including writing, research, convening, and setting the direction of the paper.
**Significant contribution, including editing, convening, detailed input, and setting the direction of the paper.
***Work done while an independent contractor for OpenAI.
†Corresponding authors: Markus Anderljung (markus.anderljung@governance.ai) and Anton Korinek (akorinek@brookings.edu).