Nonlinear Maximum Margin Multi-View Learning with Adaptive Kernel

Jia He, Changying Du, Changde Du, Fuzhen Zhuang, Qing He, Guoping Long

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1830-1836. https://doi.org/10.24963/ijcai.2017/254

Existing kernel-based multi-view learning methods either require the user to select and tune a single predefined kernel or must compute and store many Gram matrices to perform multiple kernel learning. Beyond the substantial cost in human effort, computation, and memory, most of these models seek point estimates of their parameters and are prone to overfitting on small training sets. This paper presents an adaptive-kernel nonlinear max-margin multi-view learning model under the Bayesian framework. Specifically, we regularize the posterior of an efficient multi-view latent variable model by explicitly mapping the latent representations extracted from multiple data views into a random Fourier feature space, where max-margin classification constraints are imposed. Assuming these random features are drawn from Dirichlet process Gaussian mixtures, we can adaptively learn shift-invariant kernels from data according to Bochner's theorem. For inference, we employ the data augmentation idea for the hinge loss and design an efficient gradient-based MCMC sampler in the augmented space. Since it never needs to compute a Gram matrix, our algorithm scales linearly with the size of the training set. Extensive experiments on real-world datasets demonstrate that our method achieves superior performance.
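The random Fourier feature construction referenced above can be illustrated with a minimal sketch. By Bochner's theorem, any shift-invariant kernel is the Fourier transform of a spectral density over frequencies; sampling frequencies from that density yields an explicit finite-dimensional feature map whose inner products approximate the kernel without ever forming a Gram matrix. The Gaussian spectral density below is only the textbook special case (recovering the RBF kernel); the paper instead places a Dirichlet process Gaussian mixture on the frequencies to learn the kernel adaptively.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 5, 5000  # input dimension, number of random features

# For the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2), Bochner's theorem
# gives spectral density p(w) = N(0, I). Sampling w from another density
# (e.g. a DP Gaussian mixture, as in the paper) induces a different kernel.
W = rng.standard_normal((D, d))
b = rng.uniform(0.0, 2.0 * np.pi, D)

def z(x):
    # Random Fourier feature map: z(x) @ z(y) approximates k(x, y).
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x = rng.standard_normal(d)
y = rng.standard_normal(d)
exact = np.exp(-np.linalg.norm(x - y) ** 2 / 2.0)
approx = z(x) @ z(y)
```

Because classification is done on the explicit features `z(x)` rather than on kernel evaluations between all training pairs, the cost per example is O(Dd), which is what gives the linear scaling in the training set size.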
Keywords:
Machine Learning: Classification
Machine Learning: Kernel Methods
Uncertainty in AI: Approximate Probabilistic Inference
Machine Learning: Multi-instance/Multi-label/Multi-view learning