Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error

Chunhui Jiang, Guiying Li, Chao Qian, Ke Tang

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 2298-2304. https://doi.org/10.24963/ijcai.2018/318

Deep neural networks (DNNs) have achieved great success, but their deployment on mobile devices is limited by huge model size and slow inference. Much effort has therefore been devoted to pruning DNNs. Layer-wise neuron pruning methods have shown their effectiveness: they minimize the reconstruction error of the linear response with a limited number of neurons when pruning each single layer. In this paper, we propose a new layer-wise neuron pruning approach that minimizes the reconstruction error of the nonlinear units, which may be more reasonable since the error before and after the activation can differ significantly. An iterative optimization procedure combining greedy selection with gradient descent is proposed for single-layer pruning. Experimental results on benchmark DNN models show the superiority of the proposed approach. In particular, for VGGNet, the proposed approach compresses the disk space by 13.6× and brings a speedup of 3.7×; for AlexNet, it achieves a compression rate of 4.1× and a speedup of 2.2×.
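To make the layer-wise objective concrete, the sketch below illustrates the kind of nonlinear reconstruction error the abstract refers to: the squared error between a layer's activated output before and after pruning, rather than the error of the pre-activation (linear) response. This is a minimal NumPy illustration, not the authors' implementation; the function name, the mask-based pruning, and the choice of ReLU as the nonlinearity are assumptions for demonstration, and the paper's full procedure additionally selects neurons greedily and adjusts the remaining weights by gradient descent.

```python
import numpy as np

def relu(z):
    # Example nonlinearity; the paper's method applies to the layer's activation function.
    return np.maximum(z, 0.0)

def nonlinear_reconstruction_error(X, W, b, keep_mask):
    """Squared reconstruction error measured after the nonlinearity (hypothetical helper).

    X:         (n_samples, n_in) layer inputs collected from the unpruned network.
    W, b:      (n_in, n_out) weights and (n_out,) bias of the layer being pruned.
    keep_mask: boolean (n_in,) vector; False entries correspond to pruned input neurons.
    """
    # Nonlinear response of the original (unpruned) layer.
    y_full = relu(X @ W + b)
    # Nonlinear response when pruned input neurons are zeroed out.
    y_pruned = relu((X * keep_mask) @ W + b)
    # Frobenius-norm squared error between the two activated responses.
    return np.sum((y_full - y_pruned) ** 2)
```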
Keywords:
Machine Learning: Neural Networks
Machine Learning: Feature Selection; Learning Sparse Models
Machine Learning: Deep Learning
Machine Learning Applications: Applications of Supervised Learning