Steady-State Policy Synthesis for Verifiable Control

Alvaro Velasquez

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5653-5661. https://doi.org/10.24963/ijcai.2019/784

In this paper, we introduce the Steady-State Policy Synthesis (SSPS) problem, which consists of finding a stochastic decision-making policy that maximizes expected rewards while satisfying a set of asymptotic behavioral specifications. These specifications are imposed on the steady-state probability distribution of the Markov chain induced by a given policy. Since such distributions necessitate recurrence, we propose a solution which finds policies that induce recurrent Markov chains within possibly non-recurrent Markov Decision Processes (MDPs). The SSPS problem generalizes steady-state control, which has been shown to be in PSPACE. We improve upon this result by showing that SSPS is in P via linear programming. Our results are validated using CPLEX simulations on MDPs with over 10,000 states. We also prove that the deterministic variant of SSPS is NP-hard.
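To illustrate the kind of linear program the abstract alludes to, below is a minimal sketch of an occupation-measure LP for steady-state control, assuming a unichain MDP so that the stationary distribution is well defined. It uses SciPy's linprog rather than CPLEX, and the function name, the lower/upper steady-state bounds spec_lo/spec_hi, and the array layouts are illustrative assumptions; this is not the paper's exact SSPS formulation, which additionally handles non-recurrent structure.

```python
import numpy as np
from scipy.optimize import linprog

def steady_state_policy(P, R, spec_lo, spec_hi):
    """Sketch: LP over occupation measures x(s, a) for a unichain MDP.

    P[a][s, s2]  : probability of moving from s to s2 under action a
    R[s, a]      : immediate reward for taking action a in state s
    spec_lo/hi[s]: bounds on the steady-state probability of state s
    """
    n_states, n_actions = R.shape
    n_vars = n_states * n_actions          # one variable x(s, a) per pair
    idx = lambda s, a: s * n_actions + a

    # Objective: maximize expected steady-state reward (linprog minimizes).
    c = np.zeros(n_vars)
    for s in range(n_states):
        for a in range(n_actions):
            c[idx(s, a)] = -R[s, a]

    # Balance constraints: probability flow out of each state equals flow in,
    # plus a normalization row so the occupation measure sums to one.
    A_eq = np.zeros((n_states + 1, n_vars))
    b_eq = np.zeros(n_states + 1)
    for s in range(n_states):
        for a in range(n_actions):
            A_eq[s, idx(s, a)] += 1.0
        for s2 in range(n_states):
            for a in range(n_actions):
                A_eq[s, idx(s2, a)] -= P[a][s2, s]
    A_eq[n_states, :] = 1.0
    b_eq[n_states] = 1.0

    # Steady-state specifications: spec_lo[s] <= sum_a x(s, a) <= spec_hi[s].
    A_ub, b_ub = [], []
    for s in range(n_states):
        row = np.zeros(n_vars)
        for a in range(n_actions):
            row[idx(s, a)] = 1.0
        A_ub.append(row);  b_ub.append(spec_hi[s])
        A_ub.append(-row); b_ub.append(-spec_lo[s])

    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    x = res.x.reshape(n_states, n_actions)

    # Recover a stochastic policy: pi(a | s) proportional to x(s, a).
    mass = x.sum(axis=1, keepdims=True)
    policy = np.where(mass > 1e-12, x / np.maximum(mass, 1e-12), 1.0 / n_actions)
    return policy, -res.fun
```

The steady-state probability of state s under the synthesized policy is sum_a x(s, a), so the inequality rows above encode the asymptotic behavioral specifications directly as linear constraints, which is what places this class of problems in P once a suitable LP formulation is available.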
Keywords:
Planning and Scheduling: Planning under Uncertainty
Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis
Planning and Scheduling: Model-Based Reasoning
Planning and Scheduling: Markov Decision Processes
Uncertainty in AI: Markov Decision Processes