Using Natural Language for Reward Shaping in Reinforcement Learning

Prasoon Goyal, Scott Niekum, Raymond J. Mooney

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 2385-2391. https://doi.org/10.24963/ijcai.2019/331

Recent reinforcement learning (RL) approaches have shown strong performance in complex domains such as Atari games, but are highly sample-inefficient. A common approach to reducing interaction time with the environment is reward shaping, in which carefully designed reward functions give the agent intermediate rewards for progress toward the goal. Designing such rewards, however, remains a challenge. In this work, we use natural language instructions to perform reward shaping. We propose a framework that maps free-form natural language instructions to intermediate rewards and can be seamlessly integrated into any standard reinforcement learning algorithm. We experiment with Montezuma's Revenge from the Atari video-game domain, a popular benchmark in RL. Our experiments on a diverse set of 15 tasks demonstrate that, for the same number of interactions with the environment, using language-based rewards lets the agent complete the task 60% more often, averaged across all tasks, than learning without language.
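The core idea of language-based reward shaping can be sketched as follows. This is a minimal illustration, not the paper's actual model: here a hypothetical `language_reward` function scores how well the agent's recent action trace matches the instruction using simple word overlap, standing in for the learned language-action relatedness module, and that score is added as a weighted bonus to the sparse environment reward.

```python
def language_reward(instruction, action_trace):
    """Jaccard word overlap between the instruction and a textual trace of
    recent actions. A toy stand-in for a learned relatedness model."""
    instr = set(instruction.lower().split())
    trace = set(" ".join(action_trace).lower().split())
    if not instr or not trace:
        return 0.0
    return len(instr & trace) / len(instr | trace)


def shaped_reward(env_reward, instruction, action_trace, weight=0.1):
    """Augment the (often sparse) environment reward with the language
    bonus; the shaped reward can be fed to any standard RL algorithm."""
    return env_reward + weight * language_reward(instruction, action_trace)


# Sparse environment reward is 0, but the trace partially matches the
# instruction, so the agent still receives an intermediate reward.
r = shaped_reward(0.0, "climb down the ladder", ["move left", "climb down"])
```

In this sketch the bonus is simply added to the environment reward; the weight controls how strongly the language signal steers exploration relative to the true task reward.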
Keywords:
Machine Learning: Reinforcement Learning
Natural Language Processing: Natural Language Processing
Uncertainty in AI: Sequential Decision Making