Imbalanced Node Classification Beyond Homophilic Assumption

Jie Liu, Mengting He, Guangtao Wang, Quoc Viet Hung Nguyen, Xuequn Shang, Hongzhi Yin

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 7206-7214. https://doi.org/10.24963/ijcai.2023/848

Imbalanced node classification arises widely in real-world networks, where graph neural networks (GNNs) are usually strongly biased toward majority classes and suffer severe performance degradation when classifying minority-class nodes. Various imbalanced node classification methods have been proposed recently that construct synthetic nodes and edges for minority classes to balance the label and topology distributions. However, they all rely on the homophilic assumption that nodes with the same label tend to connect, despite the wide existence of heterophilic edges in real-world graphs. As a result, they uniformly aggregate features from both homophilic and heterophilic neighbors and rely on feature similarity to generate synthetic edges, so they cannot be applied to imbalanced graphs with high heterophily. To address this problem, we propose a novel model, GraphSANN, for imbalanced node classification on both homophilic and heterophilic graphs. First, we propose a unified feature mixer that generates synthetic nodes with both homophilic and heterophilic interpolation in a unified way. Next, by randomly sampling edges between synthetic nodes and existing nodes as candidate edges, we design an adaptive subgraph extractor that dynamically extracts the contextual subgraphs of candidate edges with flexible ranges. Finally, we develop a multi-filter subgraph encoder that constructs multiple filter channels to discriminatively aggregate neighbors’ information along homophilic and heterophilic edges. Extensive experiments on eight benchmark datasets demonstrate the superiority of our model for imbalanced node classification on both homophilic and heterophilic graphs.
Keywords:
Data Mining: DM: Mining graphs
Data Mining: DM: Class imbalance and unequal cost
Machine Learning: ML: Learning graphical models
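
The distinction the abstract draws between homophilic edges (same-label endpoints) and heterophilic edges (different-label endpoints) can be made concrete with the commonly used edge homophily ratio, the fraction of edges that join nodes of the same label. The sketch below is illustrative only and is not taken from the paper: the function name `edge_homophily_ratio` and the toy graph are assumptions made for this example, and it says nothing about how GraphSANN itself is implemented.

```python
def edge_homophily_ratio(edges, labels):
    """Return the fraction of edges whose two endpoints share a label.

    edges  -- list of (u, v) node-index pairs (hypothetical input format)
    labels -- list where labels[i] is the class of node i
    """
    if not edges:
        raise ValueError("graph has no edges")
    # Count homophilic edges: both endpoints carry the same class label.
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy graph: class 0 is the majority, class 1 the minority.
labels = [0, 0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]

ratio = edge_homophily_ratio(edges, labels)
print(f"edge homophily ratio: {ratio:.2f}")
```

A ratio near 1 indicates a strongly homophilic graph, while a ratio near 0 indicates high heterophily, the setting in which the abstract argues that similarity-based edge generation and uniform neighbor aggregation break down.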