State-Based Recurrent SPMNs for Decision-Theoretic Planning under Partial Observability

Layton Hayes, Prashant Doshi, Swaraj Pawar, Hari Teja Tatavarti

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2526-2533. https://doi.org/10.24963/ijcai.2021/348

The sum-product network (SPN) has been extended to model sequence data with the recurrent SPN (RSPN), and to decision-making problems with sum-product-max networks (SPMNs). In this paper, we build on the concepts introduced by these extensions and present state-based recurrent SPMNs (S-RSPMNs) as a generalization of SPMNs to sequential decision-making problems where the state may not be perfectly observed. As with recurrent SPNs, S-RSPMNs utilize a repeatable template network to model sequences of arbitrary length. We present an algorithm for learning compact template structures by identifying unique belief states and the transitions between them through a state matching process that utilizes augmented data. To our knowledge, this is the first data-driven approach that learns graphical models for planning under partial observability that can be solved efficiently. S-RSPMNs retain the linear solution complexity of SPMNs, and we demonstrate significant improvements in compactness of representation and in the run time of structure learning and inference in sequential domains.
Keywords:
Machine Learning: Learning Graphical Models
Planning and Scheduling: Model-Based Reasoning
Planning and Scheduling: Planning under Uncertainty