Large Margin Boltzmann Machines

Boltzmann Machines are a powerful class of undirected graphical models. Originally proposed as artificial neural networks, they can be regarded as a type of Markov Random Field in which the connection weights between nodes are symmetric and learned from data. They are also closely related to recent models such as Markov logic networks and Conditional Random Fields. A major challenge for Boltzmann machines (as well as other graphical models) is speeding up learning for large-scale problems. The heart of the problem lies in efficiently and effectively approximating the partition function. In this paper, we propose a new efficient learning algorithm for Boltzmann machines that allows them to be applied to problems with large numbers of random variables. We introduce a new large-margin variational approximation to the partition function that allows Boltzmann machines to be trained using a support vector machine (SVM) style learning algorithm. For discriminative learning tasks, these large margin Boltzmann machines provide an alternative approach to structural SVMs. We show that these machines have low sample complexity and derive a generalization bound. Our results demonstrate that on multi-label classification problems, large margin Boltzmann machines achieve orders of magnitude faster performance than structural SVMs and also outperform structural SVMs on problems with large numbers of labels.

Xu Miao, Rajesh P. N. Rao