Emergent Tangled Program Graphs in Multi-Task Learning

Stephen Kelly, Malcolm Heywood

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Best Sister Conferences. Pages 5294-5298. https://doi.org/10.24963/ijcai.2018/740

We propose a Genetic Programming (GP) framework that addresses high-dimensional Multi-Task Reinforcement Learning (MTRL) through emergent modularity. A bottom-up process is assumed in which multiple programs self-organize into collective decision-making entities, or teams, which then further develop into multi-team policy graphs, or Tangled Program Graphs (TPG). The framework learns to play three Atari video games simultaneously, producing a single control policy that matches or exceeds leading results from (game-specific) deep reinforcement learning in each game. More importantly, unlike the fixed representation assumed for deep learning, TPG policies start simple and adaptively complexify through interaction with the task environment, resulting in agents that are exceedingly simple and operate in real time without specialized hardware support such as GPUs.
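To make the team and graph mechanism concrete, the sketch below shows how a TPG policy graph can map an observation to an atomic action: each team polls its programs for a bid, and the highest-bidding program's action is either an atomic action or an edge to another team, which is followed recursively while tracking visited teams so traversal terminates. This is a minimal illustration under stated assumptions, not the paper's implementation; the class names are hypothetical, and the linear bidding function is a stand-in for the evolved register-machine programs that produce bids in TPG.

```python
import random

class Program:
    """Minimal stand-in for an evolved TPG program (hypothetical model).

    In TPG, a program is an evolved register machine whose output is a
    bid; here we substitute a random linear scorer for illustration.
    Its action is either an atomic action id (int) or a reference to
    another Team.
    """
    def __init__(self, action):
        self.action = action      # int (atomic action) or Team (graph edge)
        self.weights = None       # lazily sized to the observation length

    def bid(self, observation):
        # Stand-in for an evolved program's output: a fixed random
        # linear projection of the observation.
        if self.weights is None:
            self.weights = [random.uniform(-1, 1) for _ in observation]
        return sum(w * x for w, x in zip(self.weights, observation))

class Team:
    """A team of programs; the highest bidder decides the action."""
    def __init__(self, programs):
        self.programs = programs

    def act(self, observation, visited):
        visited.add(self)
        # Skip programs whose action points back to an already-visited
        # team, so graph traversal cannot cycle. TPG guarantees every
        # team retains at least one program with an atomic action, so
        # the candidate list is never empty under that invariant.
        candidates = [p for p in self.programs
                      if not isinstance(p.action, Team)
                      or p.action not in visited]
        winner = max(candidates, key=lambda p: p.bid(observation))
        if isinstance(winner.action, Team):
            return winner.action.act(observation, visited)  # follow edge
        return winner.action                                # atomic action

# Example traversal: a root team that either delegates to a leaf team
# or acts directly (toy observation vector, 3 atomic actions).
leaf = Team([Program(action=a) for a in range(3)])
root = Team([Program(action=leaf), Program(action=2)])
print(root.act([0.1, -0.4, 0.9, 0.2], visited=set()))
```

Because only the teams along one root-to-action path execute per decision, evaluation cost depends on the path taken rather than the size of the whole graph, which is consistent with the real-time operation claimed above.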
Keywords:
Machine Learning: Reinforcement Learning
Machine Learning: Transfer, Adaptation, Multi-task Learning
Multidisciplinary Topics and Applications: Computer Games