Sampling with Minimum Sum of Squared Similarities for Nystrom-Based Large Scale Spectral Clustering / 2313
Djallel Bouneffouf, Inanc Birol
The Nystrom method provides an efficient sampling approach for large scale clustering problems, by generating a low-rank matrix approximation. However, existing sampling methods are limited by accuracy and computing time. This paper proposes an improved Nystrom-based clustering algorithm with a new sampling procedure, Minimum Sum of Squared Similarities (MSSS). Experiments on synthetic and real data sets show that the proposed sampling performs with higher accuracy than existing algorithms, applied to Nystrom-based spectral clustering problems. Furthermore, we provide a theoretical analysis that allows us to define the upper bound of the Frobenius norm error of the MSSS.