Video Summarization via Label Distributions Dual-Reward

Yongbiao Gao, Ning Xu, Xin Geng

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2403-2409. https://doi.org/10.24963/ijcai.2021/331

Reinforcement learning, which maps perceived state representations to actions, has been adopted to solve the video summarization problem. The reward is crucial when dealing with video summarization via reinforcement learning, since the reward signal defines the goal of the task. However, existing reward mechanisms in reinforcement learning cannot handle the ambiguity that appears frequently in video summarization, i.e., the differing perceptions that different people have of the same video. To solve this problem, in this paper label distributions are mapped from the CNN and LSTM-based state representation to capture the subjectiveness of video summaries. The dual-reward is designed by measuring the similarity between user score distributions and the generated label distributions. Not only the average score but also the variance of the subjective opinions is considered in summary generation. Experimental results on several benchmark datasets show that our proposed method outperforms other approaches under various settings.
Keywords:
Machine Learning: Multi-instance; Multi-label; Multi-view learning
Machine Learning Applications: Applications of Reinforcement Learning
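
As a rough illustration of the idea in the abstract, the sketch below turns per-frame annotator scores into a score distribution and rewards a predicted label distribution by its similarity to it. The abstract does not specify the similarity measure or the exact form of the dual-reward, so the KL-based similarity, the 1-to-5 score levels, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import math

def score_distribution(scores, levels):
    """Empirical distribution of annotator scores over discrete score levels."""
    counts = [sum(1 for s in scores if s == level) for level in levels]
    total = sum(counts)
    return [c / total for c in counts]

def mean_and_variance(dist, levels):
    """Mean and variance of a distribution defined over the score levels."""
    mean = sum(p * level for p, level in zip(dist, levels))
    var = sum(p * (level - mean) ** 2 for p, level in zip(dist, levels))
    return mean, var

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) with a small smoothing term to avoid log(0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def similarity_reward(user_dist, predicted_dist):
    """Assumed reward: 1.0 for identical distributions, decaying with KL."""
    return math.exp(-max(0.0, kl_divergence(user_dist, predicted_dist)))

# Hypothetical 1-to-5 importance scores from four annotators for two frames.
levels = [1, 2, 3, 4, 5]
agreed = score_distribution([3, 3, 3, 3], levels)    # annotators agree
disputed = score_distribution([1, 5, 1, 5], levels)  # annotators disagree

# Both frames share the same average score but differ in variance,
# which a reward based only on the mean could not distinguish.
agreed_stats = mean_and_variance(agreed, levels)
disputed_stats = mean_and_variance(disputed, levels)

reward_match = similarity_reward(agreed, agreed)
reward_mismatch = similarity_reward(disputed, agreed)
```

The two example frames have the same mean score but different variances, so comparing full distributions separates them where an average-only reward would not.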