State-Based Recurrent SPMNs for Decision-Theoretic Planning under Partial Observability

Layton Hayes; Prashant Doshi; Swaraj Pawar; Hari Teja Tatavarti

doi:10.24963/ijcai.2021/348

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence

Main Track. Pages 2526-2533. https://doi.org/10.24963/ijcai.2021/348

The sum-product network (SPN) has been extended to model sequence data with the recurrent SPN (RSPN), and to decision-making problems with the sum-product-max network (SPMN). In this paper, we build on the concepts introduced by these extensions and present state-based recurrent SPMNs (S-RSPMNs) as a generalization of SPMNs to sequential decision-making problems where the state may not be perfectly observed. As with recurrent SPNs, S-RSPMNs utilize a repeatable template network to model sequences of arbitrary length. We present an algorithm for learning compact template structures by identifying unique belief states and the transitions between them through a state matching process that utilizes augmented data. To our knowledge, this is the first data-driven approach that learns efficiently solvable graphical models for planning under partial observability. S-RSPMNs retain the linear solution complexity of SPMNs, and we demonstrate significant improvements in compactness of representation and in the run time of structure learning and inference in sequential domains.

Keywords:

Machine Learning: Learning Graphical Models

Planning and Scheduling: Model-Based Reasoning

Planning and Scheduling: Planning under Uncertainty