Learning Word Vectors with Linear Constraints: A Matrix Factorization Approach


Wenye Li, Jiawei Zhang, Jianjun Zhou, Laizhong Cui

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 4187-4193. https://doi.org/10.24963/ijcai.2018/582

Learning vector-space representations of words, or word embeddings, has attracted much recent research attention. With the objective of better capturing the semantic and syntactic information inherent in words, we propose two new embedding models based on the singular value decomposition of lexical word co-occurrences. Unlike previous work, our models allow linear constraints to be injected into the decomposition, so that the desired semantic and syntactic information is preserved in the word vectors. Conceptually, the models offer a flexible and convenient way to encode prior knowledge about words. Computationally, they can be solved directly by matrix factorization. Surprisingly simple yet effective, the proposed models achieve significantly improved performance on empirical word analogy and sentence classification evaluations, and demonstrate high potential for practical applications.
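To make the SVD-based starting point concrete, below is a minimal sketch of the unconstrained baseline the paper builds on: a word co-occurrence matrix is reweighted by positive pointwise mutual information (PPMI) and factorized with a truncated SVD, with rows of the scaled left singular vectors serving as word vectors. The function names (build_cooccurrence, ppmi, svd_embed) and the PPMI weighting are illustrative assumptions, not taken from the paper, and the paper's key contribution (injecting linear constraints into the decomposition) is not reproduced here.

```python
import numpy as np
from collections import Counter


def build_cooccurrence(sentences, window=5):
    """Count symmetric word co-occurrences within a sliding window."""
    vocab = {w: i for i, w in enumerate(sorted({w for s in sentences for w in s}))}
    counts = Counter()
    for s in sentences:
        for i, w in enumerate(s):
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if i != j:
                    counts[(vocab[w], vocab[s[j]])] += 1.0
    M = np.zeros((len(vocab), len(vocab)))
    for (i, j), c in counts.items():
        M[i, j] = c
    return M, vocab


def ppmi(M):
    """Positive pointwise mutual information weighting of raw counts."""
    total = M.sum()
    row = M.sum(axis=1, keepdims=True)
    col = M.sum(axis=0, keepdims=True)
    expected = row @ col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(np.where(expected > 0, M / expected, 1.0))
    return np.maximum(pmi, 0.0)  # clip negative/-inf PMI to zero


def svd_embed(M, dim=100):
    """Truncated SVD: rows of U * sqrt(S) are the word vectors."""
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :dim] * np.sqrt(S[:dim])


# Toy usage on a tiny corpus.
sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
M, vocab = build_cooccurrence(sentences, window=2)
vectors = svd_embed(ppmi(M), dim=2)
print(vectors[vocab["cat"]])
```

In this unconstrained form the factorization is free to place related words anywhere in the space; the paper's models instead constrain the decomposition linearly so that specified semantic and syntactic relations are maintained in the resulting vectors.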
Keywords:
Natural Language Processing: Natural Language Processing
Machine Learning: Unsupervised Learning