Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

Zebang Shen, Hui Qian, Tongzhou Mu, Chao Zhang

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 2715-2721. https://doi.org/10.24963/ijcai.2017/378

Nowadays, algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications. In this paper, we propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve the large-scale empirical risk minimization problem that arises in learning tasks. While enjoying a provably superior convergence rate, in each iteration the algorithm accesses only a mini-batch of samples and updates a small block of variable coordinates, which substantially reduces the amount of memory traffic when both a massive sample size and ultra-high dimensionality are involved. Specifically, to obtain an ε-accurate solution, our algorithm requires only O(log(1/ε)/√ε) overall computation in the general convex case and O((n + √(nκ)) log(1/ε)) in the strongly convex case, where n is the sample size and κ is the condition number. Empirical studies on huge-scale datasets illustrate the efficiency of our method in practice.
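
To make the doubly stochastic update concrete, below is a minimal Python sketch for ridge-regularized least-squares ERM: each iteration samples a mini-batch of rows and a block of coordinates, then applies a momentum step restricted to that block. Everything here is illustrative; the function name and hyperparameters are hypothetical, and plain heavy-ball momentum stands in for the paper's multi-momentum acceleration scheme.

```python
import numpy as np

def doubly_stochastic_erm(X, y, n_iters=1000, batch_size=32,
                          block_size=16, lr=0.1, beta=0.9, lam=1e-3):
    """Illustrative doubly stochastic gradient loop for least-squares ERM.

    Each iteration touches only a mini-batch of samples and a small block
    of coordinates, so the memory referenced per step is roughly
    batch_size * block_size rather than n * d.
    """
    n, d = X.shape
    w = np.zeros(d)
    v = np.zeros(d)  # momentum buffer (heavy-ball, not the paper's scheme)
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        batch = rng.choice(n, size=batch_size, replace=False)  # sample rows
        block = rng.choice(d, size=block_size, replace=False)  # coordinate block
        # Residuals need the full prediction on the mini-batch; a practical
        # implementation would maintain X @ w incrementally instead.
        residual = X[batch] @ w - y[batch]
        grad_block = X[batch][:, block].T @ residual / batch_size + lam * w[block]
        v[block] = beta * v[block] - lr * grad_block
        w[block] += v[block]
    return w
```

The block-restricted update is what keeps the per-iteration cost low: only the sampled rows and columns of X are read, and only block_size entries of w and the momentum buffer are written.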
Keywords:
Machine Learning: Classification
Machine Learning: Machine Learning