α-min: A Compact Approximate Solver For Finite-Horizon POMDPs / 2582
Yann Dujardin, Tom Dietterich, Iadine Chades
In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers. One measure of simplicity for POMDP value functions is the number of alpha-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function, which can require a very large number of alpha-vectors. This paper studies methods that allow the user to explore the tradeoff between the accuracy of the value function and the number of alpha-vectors. Building on previous point-based POMDP solvers, this paper introduces a new algorithm (alpha-min) that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of alpha-vectors. At each time-step, alpha-min calculates alpha-vectors to greedily minimize the gap between current upper and lower bounds of the value function. In doing so, good upper and lower bounds are quickly reached allowing a good approximation of the problem with few alpha-vectors. Experimental results show that alpha-min provides good approximate solutions given a fixed number of alpha-vectors on small benchmark problems, on a larger randomly generated problem, as well as on a computational sustainability problem to best manage the endangered Sumatran tiger.