Approximability of Constant-horizon Constrained POMDP

Majid Khonji, Ashkan Jasour, Brian Williams

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5583-5590. https://doi.org/10.24963/ijcai.2019/775

The Partially Observable Markov Decision Process (POMDP) is a fundamental framework for planning and decision making under uncertainty. POMDP is known to be intractable to solve or even approximate when the planning horizon is long (i.e., within a polynomial number of time steps). Constrained POMDP (C-POMDP) allows constraints to be specified on some aspects of the policy in addition to the objective function. When the constraints involve bounding the probability of failure, the problem is called Chance-Constrained POMDP (CC-POMDP). Our first contribution is a reduction from CC-POMDP to C-POMDP and a novel Integer Linear Programming (ILP) formulation. Thus, any algorithm for the latter problem can be used to solve any instance of the former. Second, we show that, unlike POMDP, (C)C-POMDP is NP-Hard when the length of the planning horizon is constant. Third, we present the first Fully Polynomial Time Approximation Scheme (FPTAS) that computes (near) optimal deterministic policies for constant-horizon (C)C-POMDP in polynomial time.
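For orientation, the two notions the abstract relies on can be written in generic textbook form. This is a sketch under assumed notation, not the paper's own formulation: the symbols $\pi$ (deterministic policy), $h$ (constant horizon), $R$ (reward), $\Delta$ (risk bound), $V(\pi)$ (expected value of $\pi$), $\mathrm{OPT}$, $|I|$ (input size) and $\epsilon$ are all introduced here for illustration, and the exact guarantee proved in the paper may be stated differently.

```latex
% Generic chance-constrained planning objective (assumed notation):
\begin{align*}
\max_{\pi}\quad & V(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{h-1} R(s_t, a_t)\right] \\
\text{s.t.}\quad & \Pr\nolimits_{\pi}\!\left(\text{failure within } h \text{ steps}\right) \;\le\; \Delta .
\end{align*}

% Standard meaning of an FPTAS for a maximization problem:
% for every \epsilon > 0 the algorithm returns a feasible \pi_\epsilon with
\[
V(\pi_\epsilon) \;\ge\; (1-\epsilon)\,\mathrm{OPT},
\qquad \text{in time polynomial in } |I| \text{ and } 1/\epsilon .
\]
```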
Keywords:
Planning and Scheduling: POMDPs
Planning and Scheduling: Planning under Uncertainty
Uncertainty in AI: Markov Decision Processes