Reinforcement Learning for Turn-Taking Management in Incremental Spoken Dialogue Systems / 2831
Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre
In this article, reinforcement learning is used to learn an optimal turn-taking strategy for vocal human-machine dialogue. The Orange Labs' Majordomo dialogue system, which allows the users to have conversations within a smart home, has been upgraded to an incremental version. First, a user simulator is built in order to generate a dialogue corpus which thereafter is used to optimise the turn-taking strategy from delayed rewards with the Fitted-Q reinforcement learning algorithm. Real users test and evaluate the new learnt strategy, versus a non-incremental and a handcrafted incremental strategies. The data-driven strategy is shown to significantly improve the task completion ratio and to be preferred by the users according to subjective metrics.