The Successful Ingredients of Policy Gradient Algorithms

Sven Gronauer, Martin Gottwald, Klaus Diepold

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2455-2461. https://doi.org/10.24963/ijcai.2021/338

Despite the remarkable success of reinforcement learning in recent years, the mechanisms driving its advances are still poorly understood. In this paper, we identify these mechanisms, which we call ingredients, in on-policy policy gradient methods and empirically determine their impact on learning. To allow a fair comparison, we conduct our experiments with a unified and modular implementation. Our results underline the significance of recent algorithmic advances and demonstrate that reaching state-of-the-art performance may not require sophisticated algorithms but can also be achieved by combining a few simple ingredients.
Keywords:
Machine Learning: Deep Reinforcement Learning
AI Ethics, Trust, Fairness: Reproducibility
Multidisciplinary Topics and Applications: Validation and Verification