Filip Wolski

3 posts

Evolved Policy Gradients

Proximal Policy Optimization

We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune.


3 minute read

Robots that Learn

We've created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.


3 minute read