Multi-Vector Embedding on Networks with Taxonomies
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 2944-2950.
https://doi.org/10.24963/ijcai.2022/408
A network can effectively depict close relationships among its nodes, with labels in a taxonomy describing the nodes' rich attributes. Network embedding aims to learn a representation vector for each node and label that preserves their proximity, but most existing methods suffer from serious underfitting on datasets with dense node-label links. For instance, a node could have dozens of labels describing its diverse properties, leaving its single node vector overloaded and unable to fit all of the labels. We propose HIerarchical Multi-vector Embedding (HIME), which addresses the underfitting problem by adaptively learning multiple 'branch vectors' for each node to dynamically fit separate sets of labels in a hierarchy-aware embedding space. Moreover, a 'root vector' is learned for each node from its branch vectors to better predict the sparse but valuable node-node links using knowledge of the node's labels. Experiments show HIME's advantages over existing methods across tasks such as proximity search, link prediction, and hierarchical classification.
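The intuition behind branch vectors can be illustrated with a toy sketch. It is not HIME's actual formulation: the abstract does not specify the scoring function, the hierarchy-aware space, or how the root vector is derived, so the max-over-branches dot-product score, the mean-pooled root vector, the two-dimensional Euclidean vectors, and the label names below are all assumptions made for illustration.

```python
import math

def dot(u, v):
    """Plain Euclidean dot product (assumed similarity; not from the paper)."""
    return sum(a * b for a, b in zip(u, v))

def label_score(branch_vectors, label_vector):
    """Score a label by its best-matching branch vector (assumed max pooling)."""
    return max(dot(b, label_vector) for b in branch_vectors)

def root_vector(branch_vectors):
    """Summarize a node by the mean of its branch vectors (assumed aggregation)."""
    k = len(branch_vectors)
    return [sum(column) / k for column in zip(*branch_vectors)]

# Two hypothetical labels pointing in unrelated directions.
labels = {"enzyme": [1.0, 0.0], "membrane": [0.0, 1.0]}

# A single unit-length node vector has to compromise between the two labels.
half = math.sqrt(0.5)
single = [[half, half]]

# Two unit-length branch vectors can each fit one label.
multi = [[1.0, 0.0], [0.0, 1.0]]

single_scores = {name: label_score(single, vec) for name, vec in labels.items()}
multi_scores = {name: label_score(multi, vec) for name, vec in labels.items()}
root = root_vector(multi)

print(single_scores)
print(multi_scores)
print(root)
```

In this toy setting the single vector cannot score both labels at the maximum simultaneously, while each branch vector matches its own label exactly; the averaged root vector then gives one summary per node that could be used when comparing nodes with each other.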
Keywords:
Machine Learning: Representation learning
Data Mining: Mining Heterogeneous Data
Data Mining: Networks
Machine Learning: Multi-label
Multidisciplinary Topics and Applications: Bioinformatics