Learning Sparse Neural Networks for Better Generalization

Shiwei Liu

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Doctoral Consortium. Pages 5190-5191. https://doi.org/10.24963/ijcai.2020/735

Deep neural networks perform well on test data when they are highly overparameterized, but this overparameterization also makes them costly to train and deploy. As a leading approach to this problem, sparse neural networks have been widely used to significantly reduce network size, making training and deployment more efficient without compromising performance. Recently, sparse neural networks, whether compressed from a pre-trained model or trained sparse from scratch, have been observed to generalize as well as or even better than their dense counterparts. However, conventional techniques for finding well-fitted sparse sub-networks are expensive, and the mechanisms underlying this phenomenon are far from clear. To tackle these problems, this Ph.D. research aims to study the generalization of sparse neural networks and to propose more efficient approaches that yield sparse neural networks with generalization bounds.
Keywords:
Machine Learning: Feature Selection; Learning Sparse Models
Machine Learning: Cost-Sensitive Learning
Machine Learning: Deep Learning
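To make the first route mentioned in the abstract (compressing a pre-trained dense model into a sparse sub-network) concrete, below is a minimal illustrative sketch. It is not the method proposed in this thesis; it assumes PyTorch and a hypothetical one-shot global magnitude-pruning criterion, in which the smallest-magnitude weights are zeroed out and the resulting masks would be kept fixed during fine-tuning.

import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> dict:
    # Collect the magnitudes of all weight tensors (dim > 1), ignoring biases.
    scores = torch.cat([p.detach().abs().flatten()
                        for p in model.parameters() if p.dim() > 1])
    # Global threshold: the sparsity-quantile of |w|; weights below it are pruned.
    k = max(1, int(sparsity * scores.numel()))
    threshold = torch.kthvalue(scores, k).values
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:
            masks[name] = (p.detach().abs() > threshold).float()
            p.data.mul_(masks[name])  # zero out pruned connections in place
    return masks

# Hypothetical usage: prune a small dense MLP to roughly 90% sparsity,
# then fine-tune the surviving weights while keeping the masks fixed.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = magnitude_prune(model, sparsity=0.9)

The alternative route, training a sparse network from scratch, would instead start from a randomly initialized sparse topology and adapt the connectivity during training rather than deriving it from a pre-trained dense model.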