
A research agenda for assessing the economic impacts of code generation models

Abstract

OpenAI is developing a research program to assess the economic impacts of code generation models and is inviting collaboration with external researchers. Rapid advances in the capabilities of large language models (LLMs) trained on code have made it increasingly important to study their economic impacts on individuals, firms, and society. Codex, an LLM developed by OpenAI by fine-tuning GPT-3 on billions of lines of publicly available code from GitHub, has been shown to generate functionally correct code 28.8% of the time on a sample of evaluation problems (Chen et al. 2021). This may have important implications for the future of coding and the economics of the industries that depend on it. In this document, we lay out a research agenda to assess the effects of Codex on economic factors of interest to policymakers, firms, and the public. We make a case for this research agenda by highlighting the potentially broad applicability of code generation models to software development, the potential for other LLMs to create significant social and economic impact as model capabilities advance, and the value of using Codex to generate evidence and establish methodologies that may be applicable to research on the economic impacts of future models. We propose that academic and policy research focus on studying code generation models and other LLMs so that evidence on their economic impacts can be used to inform decision-making in three key areas: deployment policy, AI system design, and public policy. To help guide this research, we outline six priority outcome areas within the realm of economic impacts that we intend to use Codex to study: Productivity, Employment, Skill Development, Inter-firm Competition, Consumer Prices, and Economic Inequality. For each area, we briefly discuss previous literature on the impacts of artificial intelligence on that outcome, describe questions that we believe to be key inputs to the three decision-making areas mentioned above, and provide examples of research that could be conducted with Codex. To catalyze work that builds on this initial research agenda, we are announcing a Call for Expressions of Interest from external researchers to collaborate with OpenAI researchers and customers to better measure the economic impacts of code generation models and other LLMs.

Acknowledgments

Authors, equal contribution

Sam Manning (OpenResearch)
Pamela Mishkin (OpenAI)

Authors

Gillian Hadfield (University of Toronto)
Tyna Eloundou (OpenAI)
Emily Eisner (University of California, Berkeley)