Geoffrey Irving

3 posts

Fine-Tuning GPT-2 from Human Preferences

AI Safety Needs Social Scientists

AI Safety via Debate