Sparse Probabilistic Matrix Factorization by Laplace Distribution for Collaborative Filtering / 1771
Liping Jing, Peng Wang, Liu Yang
In recommendation systems, probabilistic matrix factorization (PMF) is a state-of-the-art collaborative filtering method by determining the latent features to represent users and items. However, two major issues limiting the usefulness of PMF are the sparsity problem and long-tail distribution. Sparsity refers to the situation that the observed rating data are sparse, which results in that only part of latent features are informative for describing each item/user. Long tail distribution implies that a large fraction of items have few ratings. In this work, we propose a sparse probabilistic matrix factorization method (SPMF) by utilizing a Laplacian distribution to model the item/user factor vector. Laplacian distribution has ability to generate sparse coding, which is beneficial for SPMF to distinguish the relevant and irrelevant latent features with respect to each item/user. Meanwhile, the tails in Laplacian distribution are comparatively heavy, which is rewarding for SPMF to recommend the tail items. Furthermore, a distributed Gibbs sampling algorithm is developed to efficiently train the proposed sparse probabilistic model. A series of experiments on Netfilix and Movielens datasets have been conducted to demonstrate that SPMF outperforms the existing PMF and its extended version Bayesian PMF (BPMF), especially for the recommendation of tail items.