Efficient Convex Optimization Requires Superlinear Memory (Extended Abstract)

Annie Marsden, Vatsal Sharan, Aaron Sidford, Gregory Valiant

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Sister Conferences Best Papers. Pages 6468-6473. https://doi.org/10.24963/ijcai.2023/722

Minimizing a convex function with access to a first-order oracle---which returns the function value and a (sub)gradient at a query point---is a canonical optimization problem and a fundamental primitive in machine learning. Gradient-based methods are the most popular approaches for solving this problem, owing to their simplicity and computational efficiency. These methods, however, do not achieve the information-theoretically optimal query complexity for minimizing the underlying function to small error; that optimal complexity is attained only by more expensive techniques based on cutting-plane methods. Is it possible to achieve the information-theoretically optimal query complexity without resorting to these more complex and computationally expensive methods? In this work, we use memory as a lens to study this question, and show that it is not possible to achieve the optimal query complexity without using significantly more memory than gradient descent.
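
To make the access model concrete, the following is a minimal Python sketch (not from the paper) of a first-order oracle and of (sub)gradient descent querying it. The objective f(x) = ||x||_1 and all function names are illustrative assumptions; the point of the sketch is that a method of this form stores only the current iterate and a subgradient, roughly O(d) numbers, which is the memory footprint the paper's lower bound compares against.

```python
import numpy as np

def first_order_oracle(x):
    """Illustrative oracle for f(x) = ||x||_1 (convex, non-smooth):
    returns the function value and a subgradient at the query point x."""
    return np.sum(np.abs(x)), np.sign(x)

def subgradient_descent(oracle, x0, steps=1000):
    """(Sub)gradient descent: keeps only the iterate, a subgradient, and the
    best point seen so far (O(d) memory), but needs more oracle queries than
    cutting-plane methods to reach small error."""
    x, best_x, best_val = x0.copy(), x0.copy(), np.inf
    for t in range(1, steps + 1):
        val, g = oracle(x)            # one first-order query per iteration
        if val < best_val:            # track the best iterate so far
            best_val, best_x = val, x.copy()
        x = x - g / np.sqrt(t)        # diminishing step size 1/sqrt(t)
    return best_x, best_val

d = 100
x_hat, f_hat = subgradient_descent(first_order_oracle, np.random.randn(d))
```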
Keywords: Sister Conferences Best Papers: Machine Learning