Variational Learning of Bayesian Neural Networks via Bayesian Dark Knowledge
Gehui Shen, Xi Chen, Zhihong Deng
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 2037-2043.
https://doi.org/10.24963/ijcai.2020/282
Bayesian neural networks (BNNs) have received increasing attention because they can model epistemic uncertainty, which is hard for conventional neural networks. Markov chain Monte Carlo (MCMC) methods and variational inference (VI) are the two mainstream approaches to Bayesian deep learning. The former is effective, but its storage cost is prohibitive because it has to save many samples of the neural network parameters. The latter is more time- and space-efficient, but the approximate variational posterior limits its performance. In this paper, we aim to combine the advantages of the two approaches by distilling MCMC samples into an approximate variational posterior. Building on an existing distillation technique, we first propose the variational Bayesian dark knowledge method. We then propose Bayesian dark prior knowledge, a novel distillation method that treats the MCMC posterior as the prior of a variational BNN. Both proposed methods reduce the space overhead of the teacher model, making them scalable, while maintaining a distilled posterior distribution capable of modeling epistemic uncertainty. Experimental results show that our methods outperform the existing distillation method in terms of predictive accuracy and uncertainty modeling.
Keywords:
Machine Learning: Probabilistic Machine Learning
Uncertainty in AI: Approximate Probabilistic Inference
Machine Learning: Deep Learning