Learning Montezuma's Revenge from a Single Demonstration
Evolved Policy Gradients
Learning to Model Other Minds
Better Exploration with Parameter Noise
We've found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it's worth trying on any problem.
4 minute read