Large-scale Subspace Clustering by Fast Regression Coding

Jun Li, Handong Zhao, Zhiqiang Tao, Yun Fu

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 2138-2144. https://doi.org/10.24963/ijcai.2017/297

Large-Scale Subspace Clustering (LSSC) is an important problem in the big data era. However, most existing methods (e.g., sparse or low-rank subspace clustering) cannot be applied directly to LSSC because their time complexity is quadratic or cubic in n, the number of data points. To overcome this limitation, we propose Fast Regression Coding (FRC), which optimizes regression codes while simultaneously training a non-linear function to approximate them. Using FRC, we develop an efficient Regression Coding Clustering (RCC) framework for the LSSC problem. It consists of three stages: sampling, FRC, and clustering. RCC randomly samples a small number of data points, quickly computes the codes of all data points using the non-linear function learned by FRC, and applies a large-scale spectral clustering method to the codes. In addition, we provide a theoretical guarantee that the non-linear function has a first-order approximation ability and a group effect. The theorem shows that the codes can readily be used to construct a dividable similarity graph. Compared with state-of-the-art LSSC methods, our model achieves better clustering results on large-scale datasets.
Keywords:
Machine Learning: Machine Learning
Robotics and Vision: Vision and Perception
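
The three stages named in the abstract (sampling, coding, clustering) can be illustrated with a minimal sketch. This is not the authors' FRC or RCC implementation: the abstract does not give the FRC objective, so closed-form ridge regression stands in for the learned non-linear coding function, and plain k-means stands in for the large-scale spectral clustering step. All function names and parameters (`ridge_codes`, `kmeans`, `rcc_sketch`, `m`, `lam`) are hypothetical.

```python
import numpy as np

def ridge_codes(X, D, lam=0.1):
    """Ridge regression codes of the rows of X against the sampled rows D.

    Solves min_C ||X - C D||_F^2 + lam ||C||_F^2, i.e. C = X D^T (D D^T + lam I)^-1.
    Assumption: a stand-in for the non-linear function learned by FRC.
    """
    G = D @ D.T + lam * np.eye(D.shape[0])   # (m, m), symmetric positive definite
    return np.linalg.solve(G, D @ X.T).T     # (n, m)

def kmeans(Z, k, iters=20, seed=0):
    """Plain Lloyd's k-means; a stand-in for large-scale spectral clustering."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if np.any(labels == j):          # skip empty clusters
                centers[j] = Z[labels == j].mean(0)
    return labels

def rcc_sketch(X, k, m=50, lam=0.1, seed=0):
    """Sample m points, code all n points against them, cluster the codes."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)   # stage 1: sampling
    C = ridge_codes(X, X[idx], lam)                        # stage 2: coding
    Z = np.abs(C)
    Z /= np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12  # unit-norm code rows
    return C, kmeans(Z, k, seed=seed)                      # stage 3: clustering

# Toy data: 200 points from each of two random 3-dimensional subspaces of R^20.
rng = np.random.default_rng(0)
bases = [np.linalg.qr(rng.standard_normal((20, 3)))[0] for _ in range(2)]
X = np.vstack([rng.standard_normal((200, 3)) @ B.T for B in bases])
codes, labels = rcc_sketch(X, k=2, m=40)
print(codes.shape, labels.shape)
```

The point of the sketch is the cost structure: each data point is coded against only the m sampled points, so the coding step requires one m-by-m solve plus work linear in n, rather than the quadratic or cubic cost in n that the abstract attributes to sparse and low-rank methods.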