On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Minimization

Yi Xu, Zhuoning Yuan, Sen Yang, Rong Jin, Tianbao Yang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 4003-4009. https://doi.org/10.24963/ijcai.2019/556

Extrapolation is a well-known technique for solving convex optimization problems and variational inequalities, and it has recently attracted attention for non-convex optimization. Several recent works have empirically demonstrated its success on machine learning tasks. However, its convergence has not been analyzed for non-convex minimization, so a gap remains between theory and practice. In this paper, we analyze gradient descent and stochastic gradient descent with extrapolation for finding an approximate first-order stationary point of smooth non-convex optimization problems. Our convergence upper bounds show that the algorithms with extrapolation converge faster than their counterparts without extrapolation.
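As a rough illustration of the kind of method the abstract describes, the Python sketch below implements one common extrapolation variant: form an extrapolated point from the two most recent iterates, then take a gradient step using the gradient evaluated at that point. The specific update form, the step size eta, the extrapolation weight gamma, and the toy objective are illustrative assumptions, not necessarily the exact scheme or constants analyzed in the paper.

import numpy as np

def gd_with_extrapolation(grad, x0, eta=0.01, gamma=0.3, n_iters=500):
    """Gradient descent with extrapolation (illustrative sketch).

    Each iteration extrapolates from the previous two iterates,
    then takes a gradient step evaluated at the extrapolated point.
    """
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iters):
        x_hat = x + gamma * (x - x_prev)   # extrapolation from recent iterates
        x_prev = x
        x = x - eta * grad(x_hat)          # gradient step at the extrapolated point
    return x

# Usage on a simple smooth non-convex objective f(x) = x^4 - 3x^2
# (hypothetical test function, chosen only for illustration):
f_grad = lambda x: 4.0 * x**3 - 6.0 * x
x_final = gd_with_extrapolation(f_grad, np.array([2.0]))
print(x_final)  # approaches a stationary point of f

For the stochastic variant analyzed in the paper, grad(x_hat) would be replaced by an unbiased stochastic gradient estimate, e.g., computed from a sampled mini-batch.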
Keywords:
Machine Learning: Deep Learning
Machine Learning: Probabilistic Machine Learning