Abstract

Utile Distinctions for Relational Reinforcement Learning

Utile Distinctions for Relational Reinforcement Learning

William Dabney, Amy McGovern

We introduce an approach to autonomously creating state space abstractions for an online reinforcement learning agent using a relational representation. Our approach uses a tree-based function approximation derived from McCallum's [1995] UTree algorithm. We have extended this approach to use a relational representation where relational observations are represented by attributed graphs [McGovern et al., 2003]. We address the challenges introduced by a relational representation by using stochastic sampling to manage the search space [Srinivasan, 1999] and temporal sampling to manage autocorrelation [Jensen and Neville, 2002]. Relational UTree incorporates Iterative Tree Induction [Utgoff et al., 1997] to allow it to adapt to changing environments. We empirically demonstrate that Relational UTree performs better than similar relational learning methods [Finney et al., 2002; Driessens et al., 2001] in a blocks world domain. We also demonstrate that Relational UTree can learn to play a sub-task of the game of Go called Tsume-Go [Ramon et al., 2001].