G2Pxy: Generative Open-Set Node Classification on Graphs with Proxy Unknowns

G2Pxy: Generative Open-Set Node Classification on Graphs with Proxy Unknowns

Qin Zhang, Zelin Shi, Xiaolin Zhang, Xiaojun Chen, Philippe Fournier-Viger, Shirui Pan

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4576-4583. https://doi.org/10.24963/ijcai.2023/509

Node classification is the task of predicting the labels of unlabeled nodes in a graph. State-of-the-art methods based on graph neural networks achieve excellent performance when all labels are available during training. But in real-life, models are of ten applied on data with new classes, which can lead to massive misclassification and thus significantly degrade performance. Hence, developing open-set classification methods is crucial to determine if a given sample belongs to a known class. Existing methods for open-set node classification generally use transductive learning with part or all of the features of real unseen class nodes to help with open-set classification. In this paper, we propose a novel generative open-set node classification method, i.e., G2Pxy, which follows a stricter inductive learning setting where no information about unknown classes is available during training and validation. Two kinds of proxy unknown nodes, inter-class unknown proxies and external unknown proxies are generated via mixup to efficiently anticipate the distribution of novel classes. Using the generated proxies, a closed-set classifier can be transformed into an open-set one, by augmenting it with an extra proxy classifier. Under the constraints of both cross entropy loss and complement entropy loss, G2Pxy achieves superior effectiveness for unknown class detection and known class classification, which is validated by experiments on bench mark graph datasets. Moreover, G2Pxy does not have specific requirement on the GNN architecture and shows good generalizations.
Keywords:
Machine Learning: ML: Classification
Data Mining: DM: Mining graphs