We're releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents which respect safety constraints while training.
Fine-Tuning GPT-2 from Human Preferences
Better Language Models and Their Implications
We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization.
How AI Training Scales
We've discovered that the gradient noise scale, a simple statistical metric, predicts the parallelizability of neural network training on a wide range of tasks.