BILE: An Effective Behavior-based Latent Exploration Scheme for Deep Reinforcement Learning

Yiming Wang, Kaiyan Zhao, Yan Li, Leong Hou U

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 6497-6505. https://doi.org/10.24963/ijcai.2025/723

Efficient exploration of state spaces is critical for the success of deep reinforcement learning (RL). While many methods leverage exploration bonuses to encourage exploration rather than relying solely on extrinsic rewards, these bonus-based approaches often struggle with learning efficiency and scalability, especially in environments with high-dimensional state spaces. To address these issues, we propose BehavIoral metric-based Latent Exploration (BILE). The core idea is to learn a compact representation within the behavioral metric space that preserves value differences between states. By introducing additional rewards that encourage exploration in this latent space, BILE drives the agent to visit states with higher value diversity and to exhibit more behaviorally distinct actions, leading to more effective exploration of the state space. Additionally, we present a novel behavioral metric for efficient and robust training of the state encoder, backed by theoretical guarantees. Extensive experiments on high-dimensional environments, including realistic indoor scenarios in Habitat, robotic tasks in Robosuite, and challenging discrete Minigrid benchmarks, demonstrate that our method outperforms and scales better than competing approaches.
Keywords:
Machine Learning: ML: Reinforcement learning
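The abstract describes rewarding visits to states whose latent embeddings (under a behavioral metric) are far from previously seen ones. The paper's exact bonus is not given here, so the sketch below illustrates the general idea only: a hypothetical k-nearest-neighbor novelty bonus computed over an episodic memory of latent vectors, with the encoder itself abstracted away.

```python
import numpy as np

def latent_intrinsic_reward(latent, memory, k=3):
    """Hypothetical exploration bonus in a learned latent space:
    the mean distance from `latent` to its k nearest already-visited
    latents. States that are behaviorally distinct (far from everything
    in memory) receive a larger bonus. A first visit gets a default of 1.0."""
    if len(memory) == 0:
        return 1.0
    dists = np.linalg.norm(np.asarray(memory) - latent, axis=1)
    k = min(k, len(dists))
    return float(np.mean(np.sort(dists)[:k]))

# Toy rollout: three states embedded into a 2-D latent space.
memory, rewards = [], []
for z in [np.array([0.0, 0.0]),
          np.array([0.1, 0.0]),    # near a visited latent -> small bonus
          np.array([5.0, 5.0])]:   # behaviorally distinct -> large bonus
    rewards.append(latent_intrinsic_reward(z, memory))
    memory.append(z)
```

In BILE the distance would be taken in a metric space trained to preserve value differences between states, so "far away" means "behaviorally different", not merely visually different; the Euclidean distance over raw coordinates here is a stand-in for that learned metric.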