Mask and Infill: Applying Masked Language Model for Sentiment Transfer

Xing Wu, Tao Zhang, Liangjun Zang, Jizhong Han, Songlin Hu

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5271-5277. https://doi.org/10.24963/ijcai.2019/732

This paper focuses on the task of sentiment transfer on non-parallel text, which modifies the sentiment attributes (e.g., positive or negative) of sentences while preserving their attribute-independent content. Existing methods adopt an RNN encoder-decoder structure to generate a new sentence of a target sentiment word by word; such models are trained from scratch on a particular dataset and have limited ability to produce satisfactory sentences. When people convert the sentiment attribute of a given sentence, a simple but effective approach is to replace only the sentiment tokens of the sentence with other expressions indicative of the target sentiment, instead of building a new sentence from scratch. Such a process is very similar to the task of Text Infilling or Cloze. With this intuition, we propose a two-step approach: Mask and Infill. In the mask step, we identify and mask the sentiment tokens of a given sentence. In the infill step, we utilize a pre-trained Masked Language Model (MLM) to infill the masked positions by predicting words or phrases conditioned on the context and target sentiment. (In this paper, "content" and "context" are used interchangeably, as are "style", "attribute", and "label".) We evaluate our model on two review datasets, Yelp and Amazon, by quantitative, qualitative, and human evaluations. Experimental results demonstrate that our model achieves state-of-the-art performance on both accuracy and BLEU scores.
Keywords:
Natural Language Processing: Natural Language Generation
Natural Language Processing: Sentiment Analysis and Text Mining
Machine Learning Applications: Applications of Unsupervised Learning
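
The two-step procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how sentiment tokens are identified, so a small hand-written lexicon is assumed here, and a lookup table keyed on context words stands in for the pre-trained MLM. All names (`SENTIMENT_LEXICON`, `FILLERS`, `mask_sentiment_tokens`, `infill`, `transfer`) are hypothetical.

```python
# Toy sketch of the mask-then-infill idea; NOT the authors' implementation.
# Assumptions: the lexicon and filler tables are hypothetical, and a lookup
# table stands in for the pre-trained masked language model.

MASK = "[MASK]"

# Hypothetical sentiment lexicon used to decide which tokens to mask.
SENTIMENT_LEXICON = {"great", "delicious", "terrible", "bland"}

# Hypothetical stand-in for the MLM: for each target sentiment, a filler
# word is chosen according to a cue word found in the unmasked context.
FILLERS = {
    "positive": {"food": "delicious", "service": "great"},
    "negative": {"food": "bland", "service": "terrible"},
}
DEFAULT_FILLER = {"positive": "great", "negative": "terrible"}


def mask_sentiment_tokens(tokens):
    """Mask step: replace every token found in the lexicon with MASK."""
    return [MASK if t.lower() in SENTIMENT_LEXICON else t for t in tokens]


def infill(masked_tokens, target):
    """Infill step: fill each MASK using the context and target sentiment."""
    context = {t.lower() for t in masked_tokens if t != MASK}
    filled = []
    for token in masked_tokens:
        if token != MASK:
            filled.append(token)
            continue
        # Use the first filler whose cue word occurs in the context,
        # falling back to a generic word for the target sentiment.
        choice = DEFAULT_FILLER[target]
        for cue, word in FILLERS[target].items():
            if cue in context:
                choice = word
                break
        filled.append(choice)
    return filled


def transfer(sentence, target):
    """Run both steps on a whitespace-tokenized sentence."""
    tokens = sentence.split()
    return " ".join(infill(mask_sentiment_tokens(tokens), target))


if __name__ == "__main__":
    print(transfer("the food was delicious", "negative"))
    print(transfer("the service was terrible", "positive"))
```

In this sketch only the masked position changes, so the remaining tokens of the sentence are preserved verbatim, which mirrors the motivation given in the abstract for replacing sentiment tokens instead of regenerating the whole sentence.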