Szymon Sidor

4 posts

Competitive Self-Play

Competitive Self-Play

We've found that self-play allows simulated AIs to discover physical skills like tackling, ducking, faking, kicking, catching, and diving for the ball, without explicitly designing an environment with these skills in mind.


2 minute read

Better Exploration with Parameter Noise

Better Exploration with Parameter Noise

We've found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it's worth trying on any problem.


4 minute read

OpenAI Baselines: DQN

OpenAI Baselines: DQN

We're open-sourcing OpenAI Baselines, our internal effort to reproduce reinforcement learning algorithms with performance on par with published results. We'll release the algorithms over upcoming months; today's release includes DQN and three of its variants.


4 minute read

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

We've discovered that evolution strategies (ES), an optimization technique that's been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks, while overcoming many of RL's inconveniences.


12 minute read