Quadruply Stochastic Gradients for Large Scale Nonlinear Semi-Supervised AUC Optimization

Wanli Shi, Bin Gu, Xiang Li, Xiang Geng, Heng Huang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 3418-3424. https://doi.org/10.24963/ijcai.2019/474

Semi-supervised learning is pervasive in real-world applications, where only a few labeled instances are available and large numbers of instances remain unlabeled. Since AUC is an important model evaluation metric in classification, directly optimizing AUC in the semi-supervised learning scenario has drawn much attention in the machine learning community. Recently, it has been shown that one can find an unbiased solution to the semi-supervised AUC maximization problem without knowing the class prior distribution. However, this method scales poorly to nonlinear classification problems with kernels. To address this problem, we propose a novel scalable quadruply stochastic gradient algorithm (QSG-S2AUC) for nonlinear semi-supervised AUC optimization. In each iteration of the stochastic optimization process, our method randomly samples a positive instance, a negative instance, an unlabeled instance, and their random features to compute the gradient, and then updates the model with this quadruply stochastic gradient to approach the optimal solution. More importantly, we prove that QSG-S2AUC converges to the optimal solution at a rate of O(1/t), where t is the iteration number. Extensive experimental results on a variety of benchmark datasets show that QSG-S2AUC is far more efficient than the existing state-of-the-art algorithms for semi-supervised AUC maximization, while retaining similar generalization performance.
Keywords:
Machine Learning: Semi-Supervised Learning
Machine Learning: Kernel Methods
Machine Learning Applications: Big Data; Scalability
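
The per-iteration sampling scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a squared pairwise loss, a Gaussian kernel approximated by random Fourier features, a simple weighted sum of positive-negative, positive-unlabeled, and unlabeled-negative terms, and a decaying step size. All function names and parameters (train_qsg_sketch, gamma, lam, eta0, decay, sigma) are hypothetical, and the sketch stores the sampled features in memory for simplicity.

```python
import numpy as np


def predict(X, W, B, alpha):
    """Score rows of X with f(x) = sum_i alpha_i * sqrt(2) * cos(w_i . x + b_i)."""
    return np.sqrt(2.0) * np.cos(X @ W.T + B) @ alpha


def pairwise_auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly."""
    return float(np.mean(scores_pos[:, None] > scores_neg[None, :]))


def train_qsg_sketch(Xp, Xn, Xu, n_iter=1500, sigma=2.0, lam=1e-3,
                     gamma=0.5, eta0=0.02, decay=0.01, seed=0):
    """Illustrative quadruply stochastic loop (assumed loss and schedule)."""
    rng = np.random.default_rng(seed)
    d = Xp.shape[1]
    W = np.zeros((n_iter, d))   # one random frequency per iteration
    B = np.zeros(n_iter)        # one random phase per iteration
    alpha = np.zeros(n_iter)    # one coefficient per iteration

    for t in range(n_iter):
        # Three sources of randomness: one positive, one negative and one
        # unlabeled instance, each drawn uniformly at random.
        xp = Xp[rng.integers(len(Xp))]
        xn = Xn[rng.integers(len(Xn))]
        xu = Xu[rng.integers(len(Xu))]
        pts = np.stack([xp, xn, xu])
        fp, fn, fu = predict(pts, W[:t], B[:t], alpha[:t])

        # Fourth source of randomness: a random Fourier feature whose
        # expectation recovers a Gaussian kernel of bandwidth sigma.
        w = rng.normal(0.0, 1.0 / sigma, size=d)
        b = rng.uniform(0.0, 2.0 * np.pi)
        php, phn, phu = np.sqrt(2.0) * np.cos(pts @ w + b)

        # Derivative of the assumed squared loss (1 - margin)^2 w.r.t. margin.
        g_pn = -2.0 * (1.0 - (fp - fn))   # positive vs. negative
        g_pu = -2.0 * (1.0 - (fp - fu))   # positive vs. unlabeled
        g_un = -2.0 * (1.0 - (fu - fn))   # unlabeled vs. negative

        # Chain rule: derivative of the combined loss w.r.t. f(xp), f(xn), f(xu).
        cp = gamma * g_pn + (1.0 - gamma) * g_pu
        cn = -gamma * g_pn - (1.0 - gamma) * g_un
        cu = (1.0 - gamma) * (g_un - g_pu)

        eta = eta0 / (1.0 + decay * t)    # decaying step size (assumed form)
        alpha[:t] *= 1.0 - eta * lam      # shrinkage from the L2 regularizer
        alpha[t] = -eta * (cp * php + cn * phn + cu * phu)
        W[t], B[t] = w, b

    return W, B, alpha
```

Calling train_qsg_sketch on arrays of positive, negative, and unlabeled instances returns the stored features and coefficients, which predict then uses to score new points. Memory-efficient variants of this family of methods typically regenerate each random feature from a stored seed instead of keeping W and B; that detail is omitted here.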