Storage Fit Learning with Unlabeled Data

Bo-Jian Hou, Lijun Zhang, Zhi-Hua Zhou

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1844-1850. https://doi.org/10.24963/ijcai.2017/256

By exploiting abundant unlabeled data, semi-supervised learning approaches have proven very useful in various tasks. Existing approaches, however, neglect the fact that the storage available for the learning process differs across situations, and thus learning approaches should be flexible with respect to the storage budget. In this paper, we focus on graph-based semi-supervised learning and propose two storage fit learning approaches that can adjust their behavior to different storage budgets. Specifically, we utilize low-rank matrix approximation techniques to find a low-rank approximator of the similarity matrix, thereby reducing the space complexity. The first approach is based on stochastic optimization; it is iterative and converges globally to the optimal low-rank approximator. The second approach is based on the Nyström method, which finds a good low-rank approximator efficiently and is suitable for real-time applications. Experiments on classification tasks show that the proposed methods can dynamically fit different storage budgets and achieve good performance in different scenarios.
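
The paper's own algorithms are not reproduced here; as a rough illustration of the second ingredient mentioned in the abstract, the sketch below shows a standard Nyström low-rank factorization of an RBF similarity matrix W, yielding a factor F with W ≈ F Fᵀ. The function and parameter names (nystrom_approximation, m, gamma) are illustrative assumptions, not identifiers from the paper.

import numpy as np

def nystrom_approximation(X, m, gamma=1.0, rng=None):
    """Nystrom low-rank approximation of the RBF similarity matrix of X.

    Illustrative sketch (not the paper's exact procedure): sample m landmark
    points, compute their similarities to all n points, and return a factor
    F of shape (n, m) with W ~= F @ F.T, so only O(n * m) storage is needed
    instead of the O(n^2) required by the full similarity matrix.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)   # landmark indices
    landmarks = X[idx]

    # C: n x m block of similarities between all points and the landmarks.
    sq_dists = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)
    C = np.exp(-gamma * sq_dists)

    # W_mm: m x m block of similarities among the landmarks themselves.
    W_mm = C[idx]

    # Symmetric factorization of pinv(W_mm) via eigendecomposition,
    # dropping numerically zero eigenvalues.
    vals, vecs = np.linalg.eigh(W_mm)
    keep = vals > 1e-10
    inv_sqrt = vecs[:, keep] / np.sqrt(vals[keep])

    # F such that F @ F.T = C @ pinv(W_mm) @ C.T ~= W.
    return C @ inv_sqrt

For n points and m << n landmarks, storing F costs O(nm) memory; m is the knob that can be tuned to the available storage budget, which is the kind of flexibility the abstract refers to.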
Keywords:
Machine Learning: Cost-Sensitive Learning
Machine Learning: Semi-Supervised Learning