Inferring Time-delayed Causal Relations in POMDPs from the Principle of Independence of Cause and Mechanism

Junchi Liang, Abdeslam Boularias

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 1944-1950. https://doi.org/10.24963/ijcai.2021/268

This paper introduces an algorithm for discovering implicit and delayed causal relations between events observed by a robot at regular or arbitrary times, with the objective of improving the data efficiency and interpretability of model-based reinforcement learning (RL) techniques. The proposed algorithm initially predicts observations under the Markov assumption and incrementally introduces new hidden variables to explain and reduce the stochasticity of the observations. The hidden variables are memory units that keep track of pertinent past events. Such events are systematically identified by their information gains. A test of independence between inputs and mechanisms distinguishes cases where there is a causal link between events from cases where the information gain is due to confounding variables. The learned transition and reward models are then used in a Monte Carlo tree search for planning. Experiments on simulated and real robotic tasks, and on the challenging 3D game Doom, show that this method significantly improves over current RL techniques.
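
The abstract says that pertinent past events are identified by their information gains. As a rough illustration of that quantity only, and not the paper's algorithm, the sketch below computes the empirical information gain of a binary "event occurred earlier" flag about a later observation. The function names, the toy door and key data, and the use of simple counts are all assumptions made for this example; the independence test between inputs and mechanisms is not covered.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(samples):
    """Empirical H(O) - H(O | E) for a list of (event_flag, observation) pairs.

    `event_flag` records whether a candidate past event occurred (hypothetical
    encoding); `observation` is what was observed later.
    """
    n = len(samples)
    observations = [obs for _, obs in samples]
    # Group later observations by whether the candidate event occurred.
    groups = {}
    for flag, obs in samples:
        groups.setdefault(flag, []).append(obs)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(observations) - conditional


# Toy data (assumed): the door is open exactly when a key was picked up earlier.
informative = [(True, "open")] * 4 + [(False, "closed")] * 4
# Toy data (assumed): the door state is unrelated to the candidate event.
uninformative = [(True, "open"), (True, "closed"),
                 (False, "open"), (False, "closed")] * 2

gain_informative = information_gain(informative)
gain_uninformative = information_gain(uninformative)
print(gain_informative, gain_uninformative)
```

In this toy setting, remembering the first event removes all uncertainty about the later observation, while the second event carries none. A high gain alone does not establish causation, which is why the abstract pairs it with a separate test to rule out confounding variables.
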
Keywords:
Knowledge Representation and Reasoning: Action, Change and Causality
Robotics: Cognitive Robotics