Communication-Efficient Stochastic Gradient Descent Ascent with Momentum Algorithms

Yihan Zhang, Meikang Qiu, Hongchang Gao

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4602-4610. https://doi.org/10.24963/ijcai.2023/512

Numerous machine learning models can be formulated as stochastic minimax optimization problems, such as imbalanced data classification with AUC maximization. Developing efficient algorithms for such problems is therefore of great importance. However, most existing algorithms focus on the single-machine setting and thus cannot cope with the large communication overhead of a distributed training system. Moreover, most existing communication-efficient optimization algorithms address only the traditional minimization problem and fail to handle minimax optimization. To address these challenging issues, in this paper we develop two novel communication-efficient stochastic gradient descent ascent with momentum algorithms for the distributed minimax optimization problem, which significantly reduce the communication cost via a two-way compression scheme. However, the compressed momentum makes it considerably challenging to analyze the convergence rate of our algorithms, especially in the presence of the interaction between the minimization and maximization subproblems. We address these challenges and establish the convergence rate of our algorithms for nonconvex-strongly-concave problems. To the best of our knowledge, ours are the first communication-efficient algorithms with theoretical guarantees for the minimax optimization problem. Finally, we apply our algorithms to the distributed AUC maximization problem for the imbalanced data classification task. Extensive experimental results confirm the efficacy of our algorithms in reducing communication cost.
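
As a rough illustration of the core idea described above, the sketch below runs stochastic gradient descent ascent with momentum on a toy convex-strongly-concave saddle-point problem, compressing the momentum-based updates before they would be communicated. This is a minimal single-node sketch in NumPy, assuming a top-k sparsification compressor and an exponential-moving-average momentum estimate; the compressor choice, step sizes, and toy objective are illustrative assumptions and do not reproduce the paper's actual two-way compressed algorithm or its analysis.

import numpy as np

def topk_compress(v, k):
    # Top-k sparsification: keep the k largest-magnitude entries, zero the rest.
    # (A common compressor, used here for illustration; not necessarily the
    # paper's exact scheme.)
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def sgda_momentum_step(x, y, mx, my, grad_x, grad_y, lr=0.05, beta=0.9, k=2):
    # Update momentum estimates of the stochastic gradients for both variables.
    mx = beta * mx + (1 - beta) * grad_x
    my = beta * my + (1 - beta) * grad_y
    # Compress the updates that would be communicated (one direction of a
    # two-way scheme; the server-to-worker direction would be analogous).
    cx, cy = topk_compress(mx, k), topk_compress(my, k)
    # Gradient descent on x (minimization), gradient ascent on y (maximization).
    return x - lr * cx, y + lr * cy, mx, my

# Toy objective f(x, y) = x.y - 0.5*||y||^2, so grad_x f = y and
# grad_y f = x - y; the saddle point is x = y = 0.
x, y = np.ones(4), np.ones(4)
mx, my = np.zeros(4), np.zeros(4)
for _ in range(500):
    x, y, mx, my = sgda_momentum_step(x, y, mx, my, y.copy(), x - y)
print(x, y)  # both should approach the saddle point at the origin

Note that only the compressed vectors cx and cy would cross the network, which is where the communication savings come from; the dense momentum buffers mx and my stay local to each worker.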
Keywords:
Machine Learning: ML: Optimization
Machine Learning: ML: Federated learning
Data Mining: DM: Parallel, distributed and cloud-based high performance mining