Steady-State Policy Synthesis for Verifiable Control

Alvaro Velasquez

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5653-5661. https://doi.org/10.24963/ijcai.2019/784

In this paper, we introduce the Steady-State Policy Synthesis (SSPS) problem, which consists of finding a stochastic decision-making policy that maximizes expected rewards while satisfying a set of asymptotic behavioral specifications. These specifications are imposed on the steady-state probability distribution of the Markov chain induced by a given policy. Since such distributions necessitate recurrence, we propose a solution which finds policies that induce recurrent Markov chains within possibly non-recurrent Markov Decision Processes (MDPs). The SSPS problem generalizes steady-state control, which has been shown to be in PSPACE. We improve upon this result by showing that SSPS is in P via linear programming. Our results are validated using CPLEX simulations on MDPs with over 10,000 states. We also prove that the deterministic variant of SSPS is NP-hard.
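To illustrate the kind of linear program the abstract alludes to, below is a minimal sketch of an occupation-measure LP for steady-state control, assuming a unichain MDP so that the stationary distribution is well defined. It uses SciPy's linprog rather than CPLEX, and the function name, the lower/upper steady-state bounds spec_lo/spec_hi, and the array layouts are illustrative assumptions; this is not the paper's exact SSPS formulation, which additionally handles non-recurrent structure.

```python
import numpy as np
from scipy.optimize import linprog

def steady_state_policy(P, R, spec_lo, spec_hi):
    """Sketch: LP over occupation measures x(s, a) for a unichain MDP.

    P[a][s, s2]  : probability of moving from s to s2 under action a
    R[s, a]      : immediate reward for taking action a in state s
    spec_lo/hi[s]: bounds on the steady-state probability of state s
    """
    n_states, n_actions = R.shape
    n_vars = n_states * n_actions          # one variable x(s, a) per pair
    idx = lambda s, a: s * n_actions + a

    # Objective: maximize expected steady-state reward (linprog minimizes).
    c = np.zeros(n_vars)
    for s in range(n_states):
        for a in range(n_actions):
            c[idx(s, a)] = -R[s, a]

    # Balance constraints: probability flow out of each state equals flow in,
    # plus a normalization row so the occupation measure sums to one.
    A_eq = np.zeros((n_states + 1, n_vars))
    b_eq = np.zeros(n_states + 1)
    for s in range(n_states):
        for a in range(n_actions):
            A_eq[s, idx(s, a)] += 1.0
        for s2 in range(n_states):
            for a in range(n_actions):
                A_eq[s, idx(s2, a)] -= P[a][s2, s]
    A_eq[n_states, :] = 1.0
    b_eq[n_states] = 1.0

    # Steady-state specifications: spec_lo[s] <= sum_a x(s, a) <= spec_hi[s].
    A_ub, b_ub = [], []
    for s in range(n_states):
        row = np.zeros(n_vars)
        for a in range(n_actions):
            row[idx(s, a)] = 1.0
        A_ub.append(row);  b_ub.append(spec_hi[s])
        A_ub.append(-row); b_ub.append(-spec_lo[s])

    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    x = res.x.reshape(n_states, n_actions)

    # Recover a stochastic policy: pi(a | s) proportional to x(s, a).
    mass = x.sum(axis=1, keepdims=True)
    policy = np.where(mass > 1e-12, x / np.maximum(mass, 1e-12), 1.0 / n_actions)
    return policy, -res.fun
```

The steady-state probability of state s under the synthesized policy is sum_a x(s, a), so the inequality rows above encode the asymptotic behavioral specifications directly as linear constraints, which is what places this class of problems in P once a suitable LP formulation is available.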
Keywords:
Planning and Scheduling: Planning under Uncertainty
Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis
Planning and Scheduling: Model-Based Reasoning
Planning and Scheduling: Markov Decision Processes
Uncertainty in AI: Markov Decision Processes