Paul Christiano

5 posts

Fine-Tuning GPT-2 from Human Preferences

Learning Complex Goals with Iterated Amplification

Gathering Human Feedback

Learning from Human Preferences

Concrete AI Safety Problems