Metadata-dependent Infinite Poisson Factorization for Efficiently Modelling Sparse and Large Matrices in Recommendation

Trong Dinh Thac Do, Longbing Cao

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 5010-5016. https://doi.org/10.24963/ijcai.2018/695

Matrix Factorization (MF) is widely used in Recommender Systems (RSs) to estimate missing ratings in the rating matrix, but it struggles with very sparse and very large data. Poisson Factorization (PF), an MF variant, addresses these challenges efficiently by computing only on the non-missing elements. However, ignoring the missing elements makes PF weak at, or incapable of, handling rows or columns with very few observations (corresponding to sparse users or items). In this work, we propose Metadata-dependent Poisson Factorization (MPF), which addresses user/item sparsity by integrating user/item metadata into PF. MPF adds the metadata-based observed entries to the factorized PF matrices. In addition, as with MF, choosing a suitable number of latent components for PF is very expensive on very large datasets. Accordingly, we further extend MPF to Metadata-dependent Infinite Poisson Factorization (MIPF), which integrates a Bayesian Nonparametric (BNP) technique to tune the number of latent components automatically. Our empirical results show that, by integrating metadata, MPF and MIPF significantly outperform state-of-the-art PF models on sparse and large datasets. MIPF also effectively estimates the number of latent components.
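
To illustrate why PF only needs to compute on the non-missing elements, the following is a minimal sketch of the standard Poisson factorization log-likelihood, not the MPF/MIPF models proposed here. It assumes each entry follows y_ui ~ Poisson(theta_u · beta_i) with missing entries treated as zero counts. Under that assumption the log-term vanishes for zeros, and the remaining sum of rates over all entries factorizes into a product of column sums. The function name and argument names are hypothetical, chosen for illustration only.

```python
import numpy as np
from math import lgamma


def poisson_loglik_sparse(rows, cols, vals, theta, beta):
    """Log-likelihood of y_ui ~ Poisson(theta_u . beta_i) over the full matrix,
    touching only the nonzero entries given as (rows, cols, vals) triplets.

    theta: (n_users, K) non-negative user factors (hypothetical name)
    beta:  (n_items, K) non-negative item factors (hypothetical name)
    """
    # Rates theta_u . beta_i for the nonzero entries only.
    rates = np.einsum("nk,nk->n", theta[rows], beta[cols])
    # y * log(rate) - log(y!) is zero whenever y == 0, so zeros are skipped.
    log_fact = np.array([lgamma(v + 1.0) for v in vals])
    nonzero_term = np.sum(vals * np.log(rates) - log_fact)
    # Sum of rates over ALL (u, i) pairs equals (sum_u theta_u) . (sum_i beta_i),
    # which costs O((n_users + n_items) * K) instead of O(n_users * n_items * K).
    total_rate = theta.sum(axis=0) @ beta.sum(axis=0)
    return nonzero_term - total_rate
```

The cost of this evaluation scales with the number of nonzero entries plus the number of users and items, rather than with the full matrix size. The same property also means that a row or column with very few nonzero entries contributes little information about its factors.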
Keywords:
Uncertainty in AI: Approximate Probabilistic Inference
Uncertainty in AI: Graphical Models
Machine Learning: Probabilistic Machine Learning
Machine Learning: Recommender Systems