Toward Estimating Others' Transition Models Under Occlusion for Multi-Robot IRL / 1867
Kenneth Bogert, Prashant Doshi
Multi-robot inverse reinforcement learning (mIRL) is broadly useful for learning, from observations, the behaviors of multiple robots executing fixed trajectories and interacting with each other. In this paper, we relax a crucial assumption in IRL to make it better suited for wider robotic applications: we allow the transition functions of other robots to be stochastic and do not assume that the transition error probabilities are known to the learner. Challenged by occlusion where large portions of others' state spaces are fully hidden, we present a new approach that maps stochastic transitions to distributions over features. Then, the underconstrained problem is solved using nonlinear optimization that maximizes entropy to learn the transition function of each robot from occluded observations. Our methods represent significant and first steps toward making mIRL pragmatic.