EigenNet: Towards Fast and Structural Learning of Deep Neural Networks
Ping Luo
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 2428-2434.
https://doi.org/10.24963/ijcai.2017/338
Deep neural networks (DNNs) are difficult to train and prone to overfitting. We address these two issues by introducing EigenNet, an architecture that not only accelerates training but also adjusts the number of hidden neurons to reduce overfitting. Both are achieved by whitening the information flows of DNNs and removing those eigenvectors that may capture noise. The former improves the conditioning of the Fisher information matrix, whilst the latter increases generalization capability. These appealing properties of EigenNet can benefit many recent DNN structures, such as network in network and inception, by wrapping their hidden layers into the layers of EigenNet. The modeling capacities of the original networks are preserved. Compared to stochastic gradient descent, EigenNet reduces both the training wall-clock time and the number of updates on various datasets, including MNIST, CIFAR-10, and CIFAR-100.
Keywords:
Machine Learning: Deep Learning
Robotics and Vision: Vision and Perception
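
The two operations named in the abstract, whitening a layer's activations and discarding eigenvectors that may capture noise, can be illustrated with ordinary PCA whitening. The following is a minimal sketch and not the EigenNet layer itself, whose details are not given in the abstract. The function name `whiten_and_truncate`, the `keep_ratio` variance threshold used to decide which eigenvectors to drop, and the `eps` stabilizer are all assumptions made for illustration.

```python
import numpy as np

def whiten_and_truncate(h, keep_ratio=0.99, eps=1e-5):
    """PCA-whiten activations h of shape (n_samples, n_features).

    Eigenvectors are kept, largest eigenvalue first, until they explain
    `keep_ratio` of the total variance; the rest are dropped.
    `keep_ratio` and `eps` are illustrative assumptions, not values
    taken from the paper.
    """
    mu = h.mean(axis=0)
    centered = h - mu
    cov = centered.T @ centered / h.shape[0]

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    # Smallest k whose leading eigenvalues reach the variance threshold.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum, keep_ratio)) + 1, eigvals.size)

    # Project onto the kept eigenvectors and rescale to unit variance.
    proj = eigvecs[:, :k] / np.sqrt(eigvals[:k] + eps)
    return centered @ proj, proj, mu
```

The returned activations have an approximately identity covariance, and their width `k` can be smaller than the input width, which is one simple way to read "adjusts the number of hidden neurons". At inference time the stored `mu` and `proj` would be reused as `(h_new - mu) @ proj`.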