Scaling Goal-based Exploration via Pruning Proto-goals

Akhil Bagaria, Tom Schaul

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 3451-3460. https://doi.org/10.24963/ijcai.2023/384

One of the gnarliest challenges in reinforcement learning (RL) is exploration that scales to vast domains, where novelty- or coverage-seeking behaviour falls short. Goal-directed, purposeful behaviours can overcome this, but they rely on a good goal space. The core challenge in goal discovery is finding the right balance between generality (goals that are not hand-crafted) and tractability (goals that are useful and not too numerous). Our approach explicitly seeks the middle ground: the human designer specifies a vast but meaningful proto-goal space, and an autonomous discovery process refines it to a narrower space of controllable, reachable, novel, and relevant goals. We then demonstrate the effectiveness of goal-conditioned exploration with this narrower goal space in three challenging environments.
Keywords:
Machine Learning: ML: Reinforcement learning
Machine Learning: ML: Deep reinforcement learning