The Successful Ingredients of Policy Gradient Algorithms
Sven Gronauer, Martin Gottwald, Klaus Diepold
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2455-2461.
https://doi.org/10.24963/ijcai.2021/338
Despite the remarkable successes of recent years, the mechanisms driving the advances in reinforcement learning are still poorly understood. In this paper, we identify these mechanisms (which we call ingredients) in on-policy policy gradient methods and empirically determine their impact on learning. To allow a fair assessment, we conduct our experiments on a unified and modular implementation. Our results underline the significance of recent algorithmic advances and demonstrate that reaching state-of-the-art performance may not require sophisticated algorithms but can also be achieved by combining a few simple ingredients.
Keywords:
Machine Learning: Deep Reinforcement Learning
AI Ethics, Trust, Fairness: Reproducibility
Multidisciplinary Topics and Applications: Validation and Verification