Adapting BERT for Target-Oriented Multimodal Sentiment Classification

Jianfei Yu, Jing Jiang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5408-5414. https://doi.org/10.24963/ijcai.2019/751

As an important task in Sentiment Analysis, Target-oriented Sentiment Classification (TSC) aims to identify the sentiment polarity expressed toward each opinion target in a sentence. However, existing approaches to this task rely primarily on textual content and ignore other increasingly popular multimodal data sources (e.g., images), which can enhance the robustness of these text-based models. Motivated by this observation and inspired by the recently proposed BERT architecture, we study Target-oriented Multimodal Sentiment Classification (TMSC) and propose a multimodal BERT architecture. To model intra-modality dynamics, we first apply BERT to obtain target-sensitive textual representations. We then borrow the idea of self-attention and design a target attention mechanism that performs target-image matching to derive target-sensitive visual representations. To model inter-modality dynamics, we further stack a set of self-attention layers to capture multimodal interactions. Experimental results show that our model outperforms several highly competitive approaches for TSC and TMSC.
Keywords:
Natural Language Processing: Natural Language Processing
Natural Language Processing: Sentiment Analysis and Text Mining
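
The target-image matching step mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product attention, in which a target representation acts as the query and image region features act as keys and values. This is not the paper's implementation: it assumes a single attention head, omits learned projection matrices and the feature extractors, and uses hypothetical names (target_attention, target_vec, region_feats) and toy 2-dimensional vectors chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def target_attention(target_vec, region_feats):
    """Attend over image regions using the target vector as the query.

    Assumption: single head, no learned projections; the target vector
    and every region feature share the same dimensionality.
    Returns (attention weights, target-sensitive visual vector).
    """
    d = len(target_vec)
    # Scaled dot product between the target and each image region.
    scores = [
        sum(t * r for t, r in zip(target_vec, region)) / math.sqrt(d)
        for region in region_feats
    ]
    weights = softmax(scores)
    # Weighted sum of region features gives the attended visual vector.
    attended = [
        sum(w * region[i] for w, region in zip(weights, region_feats))
        for i in range(d)
    ]
    return weights, attended


# Toy example: three image regions, one aligned with the target,
# one orthogonal to it, and one opposed to it.
target = [1.0, 0.0]
regions = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, visual = target_attention(target, regions)
```

In this toy setup the region most similar to the target receives the largest weight, so the resulting visual vector is dominated by the image content that matches the target. A full model would pass such target-sensitive visual representations, together with the textual ones, through further self-attention layers to capture cross-modal interactions.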