Dependency Exploitation: A Unified CNN-RNN Approach for Visual Emotion Recognition

Xinge Zhu, Liang Li, Weigang Zhang, Tianrong Rao, Min Xu, Qingming Huang, Dong Xu

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 3595-3601. https://doi.org/10.24963/ijcai.2017/503

Visual emotion recognition aims to associate images with appropriate emotions. Visual stimuli at different levels, from low-level to high-level, can affect human emotion, such as color, texture, parts, and objects. However, most existing methods treat features from different levels as independent entities and lack an effective method for fusing them. In this paper, we propose a unified CNN-RNN model that predicts emotion from the fused features of different levels by exploiting the dependencies among them. Our proposed architecture uses a convolutional neural network (CNN) with multiple layers to extract different levels of features within a multi-task learning framework, in which two related loss functions are introduced to learn the feature representation. Considering the dependencies between the low-level and high-level features, we propose a new bidirectional recurrent neural network (RNN) to integrate the learned features from different layers of the CNN model. Extensive experiments on both Internet images and art photo datasets demonstrate that our method outperforms the state-of-the-art methods, with a performance improvement of at least 7%.
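The fusion idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It treats the per-level CNN features as a short sequence ordered from low level to high level, runs a recurrence over it in both directions, and classifies the concatenated states. Everything concrete here is an assumption made for illustration: the plain tanh recurrence (the abstract does not specify the recurrent cell), the random stand-in features and weights, the helper names, and the sizes `FEAT_DIM`, `HIDDEN`, `LEVELS`, and `CLASSES`.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rnn_pass(features, W_in, W_rec):
    """Run a plain tanh recurrence over the sequence; return every hidden state."""
    h = [0.0] * len(W_rec)
    states = []
    for x in features:
        pre = [a + b for a, b in zip(matvec(W_in, x), matvec(W_rec, h))]
        h = [math.tanh(p) for p in pre]
        states.append(h)
    return states


def fuse_bidirectional(features, fwd, bwd):
    """Concatenate, per level, the forward and backward hidden states."""
    fwd_states = rnn_pass(features, *fwd)
    # Process high-to-low, then flip back so index i matches level i again.
    bwd_states = rnn_pass(features[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(fwd_states, bwd_states)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict_emotion(features, fwd, bwd, W_out):
    """Flatten the fused states and map them to class probabilities."""
    fused = fuse_bidirectional(features, fwd, bwd)
    flat = [v for state in fused for v in state]
    return softmax(matvec(W_out, flat))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes, chosen only to keep the example small.
FEAT_DIM, HIDDEN, LEVELS, CLASSES = 4, 3, 5, 8

rng = random.Random(0)
fwd = (random_matrix(HIDDEN, FEAT_DIM, rng), random_matrix(HIDDEN, HIDDEN, rng))
bwd = (random_matrix(HIDDEN, FEAT_DIM, rng), random_matrix(HIDDEN, HIDDEN, rng))
W_out = random_matrix(CLASSES, LEVELS * 2 * HIDDEN, rng)

# Stand-in for features taken from LEVELS different CNN layers (low to high).
features = [[rng.uniform(-1, 1) for _ in range(FEAT_DIM)] for _ in range(LEVELS)]

fused = fuse_bidirectional(features, fwd, bwd)
probs = predict_emotion(features, fwd, bwd, W_out)
```

In this sketch each level's fused state depends on both the lower levels (via the forward pass) and the higher levels (via the backward pass), which is one simple way to model dependencies among feature levels rather than treating them as independent.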
Keywords:
Machine Learning: Classification
Machine Learning: Transfer, Adaptation, Multi-task Learning
Machine Learning: Deep Learning
Robotics and Vision: Vision and Perception