Jared Kaplan

1 post

How AI Training Scales

How AI Training Scales

We've discovered that the gradient noise scale, a simple statistical metric, predicts the parallelizability of neural network training on a wide range of tasks.


7 minute read