Beyond the Map: Learning to Navigate Unseen Urban Dynamics Using Diffusion-Guided Deep Reinforcement Learning
Monu Nagar, Debasis Das
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 8750-8758.
https://doi.org/10.24963/ijcai.2025/973
Vision-based motion planning is a crucial task in Autonomous Driving (AD). Recent advances in urban AD show that integrating Imitation Learning (IL) with Deep Reinforcement Learning (DRL) makes decision-making more human-like. However, IL methods depend on expert demonstrations to learn the optimal policy and assume that these demonstrations are always optimal, which is not always true in real-world settings. This makes it difficult to adapt to diverse weather conditions and dynamic traffic scenarios, often resulting in higher collision rates and increased risks to pedestrian safety. To address these challenges, we propose a Diffusion-Guided Deep Reinforcement Learning (DGDRL) framework that integrates a diffusion model with a Soft Actor-Critic DRL method to mitigate environmental uncertainties and enable self-learning beyond the training maps for new tasks. The framework follows a novel modified partially observable Markov decision process (mPOMDP) to choose the optimal action from original and diffusion-generated observations, ensuring that the policy behavior remains consistent with the current action. We train and evaluate the proposed framework on the CARLA NoCrash benchmark and validate it under diverse urban traffic conditions (empty, regular, and dense) across multiple towns. Additionally, we compare our model against state-of-the-art techniques to assess its robustness and generalizability to new environments. The project page and code are available at https://autovisionproject.github.io/project/.
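The abstract does not spell out how the mPOMDP chooses an action from original and diffusion-generated observations. As a rough illustration only, the sketch below assumes one plausible reading: the policy proposes an action for each candidate observation, a critic scores each proposal against the original observation, and the highest-scoring action is kept. The names `select_action`, `toy_policy`, and `toy_critic` are hypothetical and are not taken from the paper or its code release.

```python
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]
Action = List[float]


def select_action(
    original_obs: Observation,
    generated_obs: Sequence[Observation],
    policy: Callable[[Observation], Action],
    critic: Callable[[Observation, Action], float],
) -> Tuple[Action, int]:
    """Return the best-scoring action and the index of the observation it came from.

    Index 0 is the original observation; indices 1.. are generated ones.
    ASSUMPTION: scoring every proposal against the original observation is an
    illustrative choice, not the selection rule described by the authors.
    """
    candidates = [original_obs, *generated_obs]
    best_action: Action = policy(candidates[0])
    best_value = critic(original_obs, best_action)
    best_index = 0
    for index, obs in enumerate(candidates[1:], start=1):
        action = policy(obs)
        value = critic(original_obs, action)
        if value > best_value:
            best_action, best_value, best_index = action, value, index
    return best_action, best_index


def toy_policy(obs: Observation) -> Action:
    # Hypothetical stand-in for a trained actor: a single steering-like value.
    return [sum(obs) / len(obs)]


def toy_critic(obs: Observation, action: Action) -> float:
    # Hypothetical stand-in for a trained Q-function: prefers actions near 0.5.
    return -((action[0] - 0.5) ** 2)


if __name__ == "__main__":
    action, source = select_action(
        original_obs=[0.0, 0.0],
        generated_obs=[[0.5, 0.5], [1.0, 1.0]],
        policy=toy_policy,
        critic=toy_critic,
    )
    print(action, source)
```

In a real Soft Actor-Critic setup the policy would be stochastic and the critic would be a learned network; the toy functions here exist only to make the selection loop executable.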
Keywords:
Robotics: ROB: Robotics and vision
Machine Learning: ML: Deep learning architectures
Machine Learning: ML: Reinforcement learning
Robotics: ROB: Motion and path planning
