Utile Distinctions for Relational Reinforcement Learning

William Dabney, Amy McGovern

We introduce an approach to autonomously creating state space abstractions for an online reinforcement learning agent using a relational representation. Our approach uses a tree-based function approximation derived from McCallum's [1995] UTree algorithm. We have extended this approach to use a relational representation where relational observations are represented by attributed graphs [McGovern et al., 2003]. We address the challenges introduced by a relational representation by using stochastic sampling to manage the search space [Srinivasan, 1999] and temporal sampling to manage autocorrelation [Jensen and Neville, 2002]. Relational UTree incorporates Iterative Tree Induction [Utgoff et al., 1997] to allow it to adapt to changing environments. We empirically demonstrate that Relational UTree performs better than similar relational learning methods [Finney et al., 2002; Driessens et al., 2001] in a blocks world domain. We also demonstrate that Relational UTree can learn to play a sub-task of the game of Go called Tsume-Go [Ramon et al., 2001].