Discovering Relevance-Dependent Bicluster Structure from Relational Data

Iku Ohama, Takuya Kida, Hiroki Arimura

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 2578-2584. https://doi.org/10.24963/ijcai.2017/359

In this paper, we propose a statistical model for relevance-dependent biclustering of relational data. The proposed model factorizes relational data into a bicluster structure with two features: (1) each object in a cluster has a relevance value, which indicates how strongly the object relates to the cluster, and (2) every cluster is related to at least one dense block. These features simplify the task of understanding the meaning of each cluster because only a few highly relevant objects need to be inspected. We introduce the Relevance-Dependent Bernoulli Distribution (R-BD) as a prior for relevance-dependent binary matrices and propose the novel Relevance-Dependent Infinite Biclustering (R-IB) model, which automatically estimates the number of clusters. Posterior inference can be performed efficiently using a collapsed Gibbs sampler because the parameters of the R-IB model can be fully marginalized out. Experimental results show that R-IB extracts more essential bicluster structure with better computational efficiency than conventional models. We further observe that the biclustering results obtained by R-IB facilitate interpretation of the meaning of each cluster.
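
To make the idea of per-object relevance concrete, the following is a minimal toy sketch, not the R-BD or R-IB definition from the paper. It assumes a simple mixing rule chosen purely for illustration: a link between two objects follows the density of their block with probability equal to the product of their relevance values, and otherwise follows a sparse background rate. All names (`sample_relational_matrix`, `most_relevant`, `theta`, `phi`) and all numeric values are hypothetical.

```python
import random


def sample_relational_matrix(row_cluster, col_cluster, row_rel, col_rel,
                             theta, phi, rng):
    """Sample a binary matrix under a toy relevance-dependent link rule.

    row_cluster[i] / col_cluster[j]: cluster index of each row / column object.
    row_rel[i] / col_rel[j]: relevance of each object, in [0, 1].
    theta[k][l]: link probability of block (k, l).
    phi: background link probability.
    """
    matrix = []
    for i, k in enumerate(row_cluster):
        row = []
        for j, l in enumerate(col_cluster):
            # Assumed rule (illustration only): the pair follows its block's
            # density with probability row_rel[i] * col_rel[j]; otherwise it
            # follows the background rate phi.
            if rng.random() < row_rel[i] * col_rel[j]:
                p = theta[k][l]
            else:
                p = phi
            row.append(1 if rng.random() < p else 0)
        matrix.append(row)
    return matrix


def most_relevant(cluster_of, relevance, cluster, top_n=3):
    """Return the indices of the top_n most relevant members of a cluster."""
    members = [i for i, c in enumerate(cluster_of) if c == cluster]
    return sorted(members, key=lambda i: relevance[i], reverse=True)[:top_n]


if __name__ == "__main__":
    rng = random.Random(0)
    row_cluster = [0, 0, 0, 1, 1, 1]
    col_cluster = [0, 0, 1, 1]
    row_rel = [0.9, 0.5, 0.1, 0.8, 0.4, 0.2]
    col_rel = [0.9, 0.3, 0.7, 0.2]
    theta = [[0.95, 0.05], [0.05, 0.95]]  # one dense block per cluster
    x = sample_relational_matrix(row_cluster, col_cluster, row_rel, col_rel,
                                 theta, phi=0.05, rng=rng)
    for row in x:
        print(row)
    # Inspect only the most relevant members of row cluster 0.
    print(most_relevant(row_cluster, row_rel, cluster=0, top_n=2))
```

Under this assumed rule, low-relevance objects look almost like background noise, so ranking members by relevance surfaces the few objects that best characterize a cluster.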
Keywords:
Machine Learning: Data Mining
Machine Learning: Learning Graphical Models
Machine Learning: Relational Learning
Machine Learning: Unsupervised Learning