Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (Extended Abstract)

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification (Extended Abstract)

Zhenpeng Chen, Sheng Shen, Ziniu Hu, Xuan Lu, Qiaozhu Mei, Xuanzhe Liu

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Sister Conferences Best Papers. Pages 4701-4705. https://doi.org/10.24963/ijcai.2020/649

Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages. To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language that has abundant labeled examples (i.e., the source language, usually English) to another language with fewer labels (i.e., the target language). The source and the target languages are usually bridged through off-the-shelf machine translation tools. Through such a channel, cross-language sentiment patterns can be successfully learned from English and transferred into the target languages. This approach, however, often fails to capture sentiment knowledge specific to the target language. In this paper, we employ emojis, which are widely available in many languages, as a new channel to learn both the cross-language and the language-specific sentiment patterns. We propose a novel representation learning method that uses emoji prediction as an instrument to learn respective sentiment-aware representations for each language. The learned representations are then integrated to facilitate cross-lingual sentiment classification.
Keywords:
Natural Language Processing: Text Classification
Data Mining: Mining Text, Web, Social Media