Reinforced Negative Sampling for Recommendation with Exposure Data

Jingtao Ding, Yuhan Quan, Xiangnan He, Yong Li, Depeng Jin

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 2230-2236. https://doi.org/10.24963/ijcai.2019/309

In implicit feedback-based recommender systems, user exposure data, which record whether or not a recommended item has been interacted with by a user, provide an important clue for selecting negative training samples. In this work, we improve the negative sampler by integrating the exposure data. We propose to generate high-quality negative instances by adversarial training to favour the difficult instances, and by optimizing an additional objective to favour the real negatives in exposure data. However, this idea is non-trivial to implement since the distribution of exposure data is latent and the item space is discrete. To this end, we design a novel RNS method (short for Reinforced Negative Sampler) that generates exposure-alike negative instances through a feature matching technique instead of directly choosing from exposure data. Optimized under the reinforcement learning framework, RNS is able to integrate user preference signals in exposure data and hard negatives. Extensive experiments on two real-world datasets demonstrate the effectiveness and rationality of our RNS method. Our implementation is available at: https://github.com/dingjingtao/ReinforceNS.
Keywords:
Machine Learning: Learning Preferences or Rankings
Machine Learning: Recommender Systems
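
The abstract describes a negative sampler trained with reinforcement learning so that it favours both hard negatives and negatives supported by exposure data. The following is a minimal illustrative sketch of that general idea, not the authors' RNS implementation: it uses a plain REINFORCE update on per-item logits, and it replaces the paper's feature matching technique with a simple reward bonus for exposed-but-not-clicked items. All names (`reinforce_step`, `rec_score`, `exposure_bonus`) and all numeric values are hypothetical.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, candidates, reward_fn, lr=0.5, baseline=0.0, rng=random):
    """One REINFORCE update on the sampler's logits over candidate negatives."""
    probs = softmax([logits[i] for i in candidates])
    # Sample one negative item from the current sampling policy.
    idx = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
    advantage = reward_fn(candidates[idx]) - baseline
    # d log pi(idx) / d logit_k = 1[k == idx] - probs[k]
    for k, item in enumerate(candidates):
        grad = (1.0 if k == idx else 0.0) - probs[k]
        logits[item] += lr * advantage * grad
    return candidates[idx]

# Toy setup for a single user (all values are assumptions for illustration).
rec_score = {0: 0.9, 1: 0.2, 2: 0.7, 3: 0.1}  # recommender's current scores
interacted = {0}                               # positives, never sampled
exposed_not_clicked = {2}                      # negatives seen in exposure data
candidates = [i for i in rec_score if i not in interacted]

def reward(item, exposure_bonus=0.5):
    # Hardness term (high recommender score) plus a bonus for exposed negatives.
    bonus = exposure_bonus if item in exposed_not_clicked else 0.0
    return rec_score[item] + bonus

rng = random.Random(0)
logits = {i: 0.0 for i in candidates}
for _ in range(500):
    reinforce_step(logits, candidates, reward, baseline=0.4, rng=rng)

final_probs = dict(zip(candidates, softmax([logits[i] for i in candidates])))
```

In this toy run the sampler concentrates its probability mass on item 2, which is both highly scored by the recommender and present in the exposure data. A real system would condition the policy on user and item features rather than keep one free logit per item.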