Balanced Clustering: A Uniform Model and Fast Algorithm

Balanced Clustering: A Uniform Model and Fast Algorithm

Weibo Lin, Zhu He, Mingyu Xiao

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 2987-2993. https://doi.org/10.24963/ijcai.2019/414

Clustering is a fundamental research topic in data mining and machine learning. In addition, many specific applications demand that the clusters obtained be balanced. In this paper, we present a balanced clustering model that is to minimize the sum of squared distances to cluster centers, with uniform regularization functions to control the balance degree of the clustering results. To solve the model, we adopt the idea of the k-means method. We show that the k-means assignment step has an equivalent minimum cost flow formulation when the regularization functions are all convex. By using a novel and simple acceleration technique for the k-means and network simplex methods our model can be solved quite efficiently. Experimental results over benchmarks validate the advantage of our algorithm compared to the state-of-the-art balanced clustering algorithms. On most datasets, our algorithm runs more than 100 times faster than previous algorithms with a better solution.
Keywords:
Machine Learning: Clustering
Constraints and SAT: Modeling;Formulation
Constraints and SAT: Constraints and Data Mining ; Machine Learning
Heuristic Search and Game Playing: Combinatorial Search and Optimisation