We're launching a transfer learning contest that measures a reinforcement learning algorithm's ability to generalize from previous experience.
We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune.