The Successful Ingredients of Policy Gradient Algorithms

Sven Gronauer; Martin Gottwald; Klaus Diepold

doi:10.24963/ijcai.2021/338

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence

Main Track. Pages 2455-2461. https://doi.org/10.24963/ijcai.2021/338

Despite the sublime success of recent years, the underlying mechanisms powering the advances of reinforcement learning are still poorly understood. In this paper, we identify these mechanisms, which we call ingredients, in on-policy policy gradient methods and empirically determine their impact on learning. To allow an equitable assessment, we conduct our experiments based on a unified and modular implementation. Our results underline the significance of recent algorithmic advances and demonstrate that reaching state-of-the-art performance may not require sophisticated algorithms but can also be accomplished by combining a few simple ingredients.

Keywords:

Machine Learning: Deep Reinforcement Learning

AI Ethics, Trust, Fairness: Reproducibility

Multidisciplinary Topics and Applications: Validation and Verification