Stability and Generalization for Stochastic (Compositional) Optimizations

Xiaokang Pan, Jin Liu, Hulin Kuang, Youqi Li, Lixing Chen, Zhe Qu

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 6039-6047. https://doi.org/10.24963/ijcai.2025/672

The use of estimators instead of stochastic gradients for updates has been shown to improve algorithm convergence rates, but their impact on generalization remains under-explored. In this paper, we investigate how estimators influence generalization. Our focus is on two widely studied problems: stochastic optimization (SO) and stochastic compositional optimization (SCO), under both convex and non-convex settings. For SO problems, we first analyze the generalization error of the STORM algorithm as a foundational step. We then extend our analysis to SCO problems by introducing an algorithmic framework that encompasses several popular approaches. Through this framework, we conduct a generalization analysis, uncovering new insights into the impact of estimators on generalization. Subsequently, we provide a detailed analysis of three specific algorithms within this framework, SCGD, SCSC, and COVER, to explore the effects of different estimator strategies. Furthermore, in the context of SCO, we propose a novel definition of stability and a new decomposition of excess risk in the non-convex setting. Our analysis yields two key findings: (1) in SCO problems, eliminating the estimator for the gradient of the inner function does not impact generalization performance while significantly reducing computational and storage overhead; (2) faster convergence rates are consistently associated with better generalization performance.
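
For context, a minimal sketch of the two problem settings and the estimator idea, using standard notation assumed from the literature rather than taken verbatim from the paper: the SCO objective composes two expectations, so an unbiased stochastic gradient of the full objective is not directly available, and algorithms such as SCGD, SCSC, and COVER maintain running estimators of the inner function and/or its gradient; STORM addresses the SO setting with a momentum-corrected gradient estimator.

% SCO objective: a composition of two expectations (standard formulation).
\[
  \min_{x \in \mathbb{R}^d} \; F(x) = f\bigl(g(x)\bigr),
  \qquad g(x) = \mathbb{E}_{\omega}\bigl[g_{\omega}(x)\bigr],
  \qquad f(y) = \mathbb{E}_{\nu}\bigl[f_{\nu}(y)\bigr].
\]
% STORM-style estimator for SO, \min_x \mathbb{E}_{\xi}[f(x;\xi)]:
% the correction term re-evaluates the fresh sample at the previous iterate,
% reducing the variance of the update direction d_t.
\[
  d_t = \nabla f(x_t;\xi_t) + (1 - a_t)\bigl(d_{t-1} - \nabla f(x_{t-1};\xi_t)\bigr),
  \qquad x_{t+1} = x_t - \eta_t d_t.
\]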
Keywords:
Machine Learning: ML: Learning theory