July 10, 2024

OpenAI and Los Alamos National Laboratory announce bioscience research partnership

OpenAI and Los Alamos National Laboratory are developing evaluations to understand how multimodal AI models can be used safely by scientists in laboratory settings.

OpenAI and Los Alamos National Laboratory (LANL) – one of the United States’ leading national laboratories – are working together to study how artificial intelligence can be used safely by scientists in laboratory settings to advance bioscientific research. This partnership follows a long tradition of the U.S. public sector, and in particular the national labs, working with the U.S. private sector to ensure advances in innovation translate to advancements in essential areas like health care and bioscience. 

The recent White House Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence tasks the U.S. Department of Energy’s national labs with helping to evaluate the capabilities of frontier AI models, including biological capabilities. This is important to OpenAI because we believe AI has the potential to multiply the speed and impact of science for good. Already, Moderna is leveraging OpenAI’s technology to augment clinical trial development by building a data-analysis assistant designed to help analyze large data sets. Color Health built a new copilot using GPT-4o to help healthcare providers make evidence-based decisions about cancer screening and treatment.

“As a private company dedicated to serving the public interest, we’re thrilled to announce a first-of-its-kind partnership with Los Alamos National Laboratory to study bioscience capabilities,” said Mira Murati, OpenAI’s Chief Technology Officer. “This partnership marks a natural progression in our mission, advancing scientific research, while also understanding and mitigating risks.”

“AI is a powerful tool that has the potential for great benefits in the field of science, but, as with any new technology, comes with risks,” said Nick Generous, deputy group leader for Information Systems and Modeling. “At Los Alamos this work will be led by the laboratory’s new AI Risks Technical Assessment Group, which will help assess and better understand those risks.”

OpenAI and Los Alamos National Laboratory’s Bioscience Division are working on an evaluation study to assess how frontier models like GPT-4o can assist humans with performing tasks in a physical laboratory setting through multimodal capabilities like vision and voice. This includes biological safety evaluations for GPT-4o and its currently unreleased real-time voice systems to understand how they could be used to support research in bioscience. We believe our upcoming evaluation will be the first of its kind and contribute to state-of-the-art research on AI biosecurity evaluations. It will build upon our existing work on biothreat risks and follow our Preparedness Framework, which outlines our approach to tracking, evaluating, forecasting, and protecting against model risks, and is consistent with our commitments to Frontier AI Safety agreed at the 2024 AI Seoul Summit. 

Our upcoming evaluation with Los Alamos will be the first experiment to test multimodal frontier models in a lab setting by assessing the abilities of both experts and novices to perform and troubleshoot a safe protocol consisting of standard laboratory experimental tasks. These tasks are intended to serve as a proxy for more complex tasks that pose a dual-use concern. Tasks may include transformation (e.g., introducing foreign genetic material into a host organism), cell culture (e.g., maintaining and propagating cells in vitro), and cell separation (e.g., through centrifugation). By examining the uplift in task completion and accuracy enabled by GPT-4o, we aim to quantify and assess how frontier models can upskill existing professionals and PhDs as well as novices in real-world biological tasks.
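As a rough illustration of what "uplift in task completion" could mean in practice, the sketch below computes it as the difference in completion rate between an assisted group and a control group. This is one common way to define uplift, not a description of the study's methodology. The `Trial` record, the group labels, and the example data are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    group: str       # hypothetical labels: "control" or "assisted"
    completed: bool  # whether the participant completed the task


def completion_rate(trials, group):
    """Fraction of trials in `group` where the task was completed."""
    selected = [t for t in trials if t.group == group]
    if not selected:
        raise ValueError(f"no trials for group {group!r}")
    return sum(t.completed for t in selected) / len(selected)


def uplift(trials):
    """Assumed definition: assisted completion rate minus control rate."""
    return completion_rate(trials, "assisted") - completion_rate(trials, "control")


# Made-up placeholder records for illustration only, not study results.
example = [
    Trial("control", True), Trial("control", False),
    Trial("control", False), Trial("control", False),
    Trial("assisted", True), Trial("assisted", True),
    Trial("assisted", False), Trial("assisted", True),
]

print(uplift(example))
```

The same comparison could be run separately for experts and novices, or applied to an accuracy score instead of a completed/not-completed flag.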

These new evaluations extend our previous work in several new dimensions: 

  1. Incorporating wet lab techniques. Written tasks and responses for synthesizing and disseminating compounds are indicative, but they do not fully capture the skills required to actually conduct biological benchwork. For example, it may be easy to know that one must conduct mass spectrometry, or even to detail the steps in writing; it is much harder to perform the procedure correctly with real samples.

  2. Incorporating multiple modalities. Our previous work focused on GPT-4 and involved only written outputs. GPT-4o’s ability to reason across modalities and take voice and visual inputs can potentially expedite learning. For example, a user less familiar with the components of a wet lab setup can simply show the setup to GPT-4o, prompt it with questions, and troubleshoot scenarios visually through the camera instead of needing to convey the situation as a written question.

Los Alamos National Laboratory has been a pioneer in safety research and we look forward to working together on novel and robust safety evaluations for frontier AI models as capabilities continue to rapidly improve. This cooperative effort not only underscores the potential of multimodal AI models like GPT-4o to support scientific research, but also emphasizes the critical importance of private and public sector collaboration in both leveraging innovation and ensuring safety. As we look forward to the results of these evaluations, we hope that this partnership will help set new standards for AI safety and efficacy in the sciences, paving the way for future innovations that benefit humanity.