The Sparse MinMax k-Means Algorithm for High-Dimensional Clustering

Sayak Dey, Swagatam Das, Rammohan Mallipeddi

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 2103-2110. https://doi.org/10.24963/ijcai.2020/291

Classical clustering methods often struggle when the number of features exceeds the number of items to be partitioned. We propose a Sparse MinMax k-Means clustering approach that reformulates the objective of the MinMax k-Means algorithm (a variation of classical k-Means that minimizes the maximum intra-cluster variance instead of the sum of intra-cluster variances) into a new weighted between-cluster sum of squares (BCSS) form. We impose sparse regularization on these weights to make the method suitable for high-dimensional clustering. In this way, we seek to carry the advantages of the MinMax k-Means algorithm over to the high-dimensional setting and generate good-quality clusters. The efficacy of the proposal is demonstrated through comparison against a few representative clustering methods on several real-world datasets.
Keywords:
Machine Learning: Clustering
Machine Learning: Feature Selection; Learning Sparse Models
Data Mining: Clustering, Unsupervised Learning
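
The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It only computes, for a fixed cluster assignment, the sum of intra-cluster variances (the classical k-Means objective), the maximum intra-cluster variance (the quantity MinMax k-Means targets), and the per-feature BCSS that a feature-weight vector would multiply. The function name `cluster_objectives`, the toy data, and the choice to measure intra-cluster variance as the sum of squared distances to the centroid are assumptions made for illustration.

```python
import numpy as np

def cluster_objectives(X, labels, k):
    """Return (sum of intra-cluster variances, max intra-cluster variance,
    per-feature BCSS) for a fixed assignment.

    Assumption: "intra-cluster variance" is taken as the sum of squared
    distances from a cluster's points to its centroid.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall_mean = X.mean(axis=0)
    intra = np.zeros(k)            # one intra-cluster variance per cluster
    bcss = np.zeros(X.shape[1])    # one between-cluster term per feature
    for c in range(k):
        members = X[labels == c]
        if len(members) == 0:
            continue               # empty clusters contribute nothing
        centroid = members.mean(axis=0)
        intra[c] = ((members - centroid) ** 2).sum()
        bcss += len(members) * (centroid - overall_mean) ** 2
    return intra.sum(), intra.max(), bcss

# Toy data (hypothetical): four points, two features, two clusters.
X = [[0, 0], [2, 0], [10, 5], [10, 5]]
labels = [0, 0, 1, 1]
total_var, max_var, bcss = cluster_objectives(X, labels, k=2)

# A weighted BCSS for an illustrative weight vector w (assumed, not learned).
w = np.array([0.5, 0.5])
weighted_bcss = float(w @ bcss)
```

In a sparse formulation, a regularizer on `w` would push the weights of uninformative features toward zero. How the weights are actually updated is specified in the paper and is not reproduced here.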