MultiMirror: Neural Cross-lingual Word Alignment for Multilingual Word Sense Disambiguation

Luigi Procopio; Edoardo Barba; Federico Martelli; Roberto Navigli

doi:10.24963/ijcai.2021/539

MultiMirror: Neural Cross-lingual Word Alignment for Multilingual Word Sense Disambiguation

Luigi Procopio, Edoardo Barba, Federico Martelli, Roberto Navigli

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence

Main Track. Pages 3915-3921. https://doi.org/10.24963/ijcai.2021/539

PDF BibTeX

Word Sense Disambiguation (WSD), i.e., the task of assigning senses to words in context, has seen a surge of interest with the advent of neural models and a considerable increase in performance up to 80% F1 in English. However, when considering other languages, the availability of training data is limited, which hampers scaling WSD to many languages. To address this issue, we put forward MultiMirror, a sense projection approach for multilingual WSD based on a novel neural discriminative model for word alignment: given as input a pair of parallel sentences, our model -- trained with a low number of instances -- is capable of jointly aligning, at the same time, all source and target tokens with each other, surpassing its competitors across several language combinations. We demonstrate that projecting senses from English by leveraging the alignments produced by our model leads a simple mBERT-powered classifier to achieve a new state of the art on established WSD datasets in French, German, Italian, Spanish and Japanese. We release our software and all our datasets at https://github.com/SapienzaNLP/multimirror.

Keywords:

Natural Language Processing: Natural Language Semantics

Natural Language Processing: Resources and Evaluation