Discriminative Learning of Beam-Search Heuristics for Planning

Yuehua Xu, Alan Fern,Sungwook Yoon

We consider the problem of learning heuristics for controlling forward state-space beam search in AI planning domains. We draw on a recent framework for "structured output classification" (e.g. syntactic parsing) known as learning as search optimization (LaSO). The LaSO approach uses discriminative learning to optimize heuristic functions for search-based computation of structured outputs and has shown promising results in a number of domains. However, the search problems that arise in AI planning tend to be qualitatively very different from those considered in structured classification, which raises a number of potential difficulties in directly applying LaSO to planning. In this paper, we discuss these issues and describe a LaSO-based approach for discriminative learning of beam-search heuristics in AI planning domains. We give convergence results for this approach and present experiments in several benchmark domains. The results show that the discriminatively trained heuristic can outperform the one used by the planner FF and another recent non-discriminative learning approach.