Method of Moments for Topic Models with Mixed Discrete and Continuous Features

Joachim Giesen, Paul Kahlmeyer, Sören Laue, Matthias Mitterreiter, Frank Nussbaum, Christoph Staudt, Sina Zarrieß

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2418-2424. https://doi.org/10.24963/ijcai.2021/333

Abstract:
Topic models are characterized by a latent class variable that represents the different topics. Traditionally, their observable variables are modeled as discrete, as for instance in the prototypical latent Dirichlet allocation (LDA) topic model, where words in text documents are encoded by discrete count vectors with respect to some dictionary. The classical approach to learning topic models optimizes a likelihood function that is non-concave due to the presence of the latent variable; hence, it mostly boils down to search heuristics like the EM algorithm for parameter estimation. Recently, it was shown that topic models can be learned with strong algorithmic and statistical guarantees through Pearson's method of moments. Here, we extend this line of work to topic models that feature discrete as well as continuous observable variables (features). Moving beyond discrete variables as in LDA allows for more sophisticated features and a natural extension of topic models to modalities other than text, such as images. We provide algorithmic and statistical guarantees for the method of moments applied to the extended topic model, which we corroborate experimentally on synthetic data. We also demonstrate the applicability of our model on real-world document data with embedded images, which we preprocess into continuous state-of-the-art feature vectors.
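The abstract's core idea, estimating parameters by matching moments rather than maximizing a non-concave likelihood, can be illustrated on a much simpler model than the paper's topic model. The sketch below (purely illustrative, not the authors' algorithm) applies the classical method of moments to a Gamma distribution, where equating the empirical mean and variance with the model moments E[X] = kθ and Var[X] = kθ² yields closed-form estimates:

```python
import numpy as np

# Illustrative method-of-moments estimation (not the paper's algorithm):
# for Gamma(shape k, scale theta), E[X] = k*theta and Var[X] = k*theta^2,
# so matching empirical mean and variance gives closed-form estimators.
rng = np.random.default_rng(0)
k_true, theta_true = 3.0, 2.0
samples = rng.gamma(shape=k_true, scale=theta_true, size=100_000)

mean, var = samples.mean(), samples.var()
k_hat = mean**2 / var    # k     = E[X]^2 / Var[X]
theta_hat = var / mean   # theta = Var[X] / E[X]

print(k_hat, theta_hat)  # close to the true values (3.0, 2.0)
```

For topic models the relevant moments are higher-order (cross-)moments of the observed features, and recovering the parameters from them requires tensor decomposition rather than closed-form formulas, but the principle of equating empirical and model moments is the same.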
Keywords:
Machine Learning: Learning Generative Models
Machine Learning: Probabilistic Machine Learning
Machine Learning: Unsupervised Learning