Refining Word Representations by Manifold Learning
Chu Yonghe, Hongfei Lin, Liang Yang, Yufeng Diao, Shaowu Zhang, Fan Xiaochao

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5394-5400. https://doi.org/10.24963/ijcai.2019/749

Pre-trained distributed word representations have proven useful in various natural language processing (NLP) tasks. However, the effect of words’ geometric structure on word representations has not yet been carefully studied. Existing word representation methods underestimate words whose distances in Euclidean space are small, while overestimating words at much greater distances. In this paper, we propose a word vector refinement model that corrects pre-trained word embeddings, bringing the Euclidean similarity of words closer to their semantics by means of manifold learning. The approach is theoretically grounded in the metric recovery paradigm. Our word representations are evaluated on a variety of lexical-level intrinsic tasks (semantic relatedness, semantic similarity), and the experimental results show that the proposed model outperforms several popular word representation approaches.
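The abstract does not give the model's details, but the general idea of refining embeddings with manifold learning can be illustrated with a minimal Isomap-style sketch: distances between nearby words are trusted, longer distances are replaced by geodesic (graph shortest-path) distances, and the vectors are re-embedded so that Euclidean distance tracks the manifold. The function name, neighborhood size, and toy data below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def refine_embeddings(X, n_neighbors=4, out_dim=2):
    """Illustrative sketch (not the paper's model): re-embed word
    vectors so Euclidean distances follow the data manifold."""
    n = len(X)
    # Pairwise Euclidean distances of the pre-trained vectors.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # kNN graph: keep only edges to each point's nearest neighbors.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        for j in np.argsort(D[i])[1:n_neighbors + 1]:
            G[i, j] = G[j, i] = D[i, j]
    # Geodesic distances via Floyd-Warshall shortest paths.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # Fallback for disconnected components (an assumption of this toy).
    finite = np.isfinite(G)
    G[~finite] = G[finite].max()
    # Classical MDS on the geodesic distances.
    J = np.eye(n) - np.ones((n, n)) / n      # double-centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:out_dim]      # top eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(0)
toy = rng.normal(size=(20, 50))              # 20 toy "word vectors"
refined = refine_embeddings(toy)
print(refined.shape)
```

In the refined space, similarity between distant words is judged along the neighborhood graph rather than by raw Euclidean distance, which is the intuition behind the over-/underestimation correction described in the abstract.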
Keywords:
Natural Language Processing: Natural Language Processing
Natural Language Processing: NLP Applications and Tools