Hierarchical Heuristic Forward Search in Stochastic Domains

Nicolas Meuleau, Ronen Brafman

Many MDPs exhibit an hierarchical structure where the agent needs to perform various subtasks that are coupled only by a small sub-set of variables containing, notably, shared resources. Previous work has shown how this hierarchical structure can be exploited by solving several sub-MDPs representing the different subtasks in different calling contexts, and a root MDP responsible for sequencing and synchronizing the subtasks, instead of a huge MDP representing the whole problem. Another important idea used by efficient algorithms for solving flat MDPs, such as (L)AO* and (L)RTDP, is to exploit reachability information and an admissible heuristics in order to accelerate the search by pruning states that cannot be reached from a given starting state under an optimal policy. In this paper, we combine both ideas and develop a variant of the AO* algorithm for performing forward heuristic search in hierarchical models. This algorithm shows great performance improvements over hierarchical approaches using standard MDP solvers such as Value Iteration, as well as with respect to AO* applied to a flat representation of the problem. Moreover, it presents a general new method for accelerating AO* and other forward search algorithms. Substantial performance gains may be obtained in these algorithms by partitioning the set of search nodes, and solving a subset of nodes completely before propagating the results to other subsets.