FLS: A New Local Search Algorithm for K-means with Smaller Search Space

Junyu Huang, Qilong Feng, Ziyun Huang, Jinhui Xu, Jianxin Wang

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 3092-3098. https://doi.org/10.24963/ijcai.2022/429

The k-means problem is an extensively studied unsupervised learning problem with various applications in decision making and data mining. In this paper, we propose a fast and practical local search algorithm for the k-means problem. Our method reduces the search space of swap pairs from O(nk) to O(k^2) and applies random mutations to find potentially better solutions when local search falls into a poor local optimum. Under the data-distribution assumption that each optimal cluster has "average" size \Omega(n/k), which is common in many datasets and k-means benchmarks, we prove that our proposed algorithm gives a (100+\epsilon)-approximate solution in expectation. Experiments show that our algorithm performs better than existing state-of-the-art local search methods on k-means benchmarks and large datasets.
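To make the size of the swap search space concrete, the following is a minimal sketch of single-swap local search for k-means in which the number of swap-in candidates per round is a parameter. It is not the FLS algorithm: the abstract does not describe how the O(k^2) swap pairs are constructed, so uniform sampling of candidates is an assumption made purely for illustration, and the random mutation step is omitted. The names `cost`, `swap_local_search`, and `num_candidates` are hypothetical.

```python
import random


def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
        for pt in points
    )


def swap_local_search(points, k, num_candidates, max_rounds=50, seed=0):
    """Single-swap local search with a restricted set of swap-in candidates.

    Each round examines num_candidates * k (swap-in, swap-out) pairs.
    num_candidates = len(points) gives the classical O(nk) search space;
    num_candidates = k gives O(k^2) pairs. Uniform sampling of candidates
    is an assumption for illustration, not the construction from the paper.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    best = cost(points, centers)
    for _ in range(max_rounds):
        candidates = rng.sample(points, min(num_candidates, len(points)))
        improved = False
        for cand in candidates:
            for i in range(k):
                # Replace center i with the candidate and keep the swap
                # only if it strictly lowers the cost.
                trial = centers[:i] + [cand] + centers[i + 1:]
                trial_cost = cost(points, trial)
                if trial_cost < best:
                    centers, best, improved = trial, trial_cost, True
        if not improved and num_candidates >= len(points):
            break  # a true local optimum only when every point was tried
    return centers, best
```

Calling `swap_local_search(points, k, len(points))` evaluates about nk swap pairs per round, while `swap_local_search(points, k, k)` evaluates about k^2. With sampled candidates, a round without improvement does not certify a local optimum, which is why the sketch stops early only in the full-candidate case.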
Keywords:
Machine Learning: Clustering
Search: Applications
Search: Combinatorial Search and Optimisation
Search: Heuristic Search
Search: Local search