Prafulla Dhariwal

3 posts

Glow: Better Reversible Generative Models

Better Exploration with Parameter Noise

We've found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it's worth trying on any problem.


4 minute read

Proximal Policy Optimization

We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune.


3 minute read