Heterogeneous Graph Information Bottleneck

Liang Yang; Fan Wu; Zichen Zheng; Bingxin Niu; Junhua Gu; Chuan Wang; Xiaochun Cao; Yuanfang Guo

doi:10.24963/ijcai.2021/226

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence

Main Track. Pages 1638-1645. https://doi.org/10.24963/ijcai.2021/226

Most attempts to extend Graph Neural Networks (GNNs) to Heterogeneous Information Networks (HINs) implicitly assume that the multiple homogeneous attributed networks induced by different meta-paths are complementary. Doubts about this complementarity hypothesis motivate an alternative assumption of consensus: the aggregated node attributes shared by multiple homogeneous attributed networks are essential for node representations, while those specific to each homogeneous attributed network should be discarded. In this paper, a novel Heterogeneous Graph Information Bottleneck (HGIB) is proposed to implement the consensus hypothesis in an unsupervised manner. To this end, the information bottleneck (IB) is extended to unsupervised representation learning by leveraging a self-supervision strategy. Specifically, HGIB simultaneously maximizes the mutual information between one homogeneous network and the representation learned from another homogeneous network, while minimizing the mutual information between the specific information contained in one homogeneous network and the representation learned from that same network. Model analysis reveals that the two extreme cases of HGIB correspond to the supervised heterogeneous GNN and infomax on a homogeneous graph, respectively. Extensive experiments on real datasets demonstrate that the consensus-based unsupervised HGIB significantly outperforms most semi-supervised SOTA methods based on the complementarity assumption.
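
The two competing terms described above can be sketched as a single trade-off objective. This is only one plausible reading of the abstract, not the paper's stated formulation: the symbols G_1, G_2, Z_1, Z_2 and the weight \beta are assumed notation, and using a conditional mutual information term to capture the "specific information" of a network is likewise an assumption.

```latex
% Assumed notation (not taken from the abstract):
%   G_1, G_2 : two homogeneous attributed networks induced by two meta-paths
%   Z_1, Z_2 : representations learned from G_1 and G_2, respectively
%   \beta \ge 0 : hypothetical trade-off weight between the two terms
\max_{Z_1, Z_2} \;
  \big[ I(Z_1; G_2) - \beta \, I(Z_1; G_1 \mid G_2) \big]
  + \big[ I(Z_2; G_1) - \beta \, I(Z_2; G_2 \mid G_1) \big]
```

Under this reading, I(Z_1; G_2) rewards a representation of one network for retaining what the other network also carries (the consensus), while I(Z_1; G_1 \mid G_2) penalizes whatever it retains about its own network that the other network cannot account for.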

Keywords:

Data Mining: Mining Graphs, Semi Structured Data, Complex Data