Partial Label Clustering

Yutong Xie, Fuchao Yang, Yuheng Jia

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 6678-6686. https://doi.org/10.24963/ijcai.2025/743

Partial label learning (PLL) is an important weakly supervised learning framework in which each training example is associated with a set of candidate labels, only one of which is the ground-truth label. For the first time, this paper investigates the partial label clustering problem, which exploits the limited partial labels available to improve clustering performance. Specifically, we first construct a weight matrix over the examples from their relationships in the feature space, and use it to disambiguate the candidate labels and estimate the ground-truth label. We then construct a set of must-link and cannot-link constraints from the disambiguation results, and propagate these initial constraints via an adversarial prior promoted dual-graph learning approach. Finally, we integrate weight matrix construction, label disambiguation, and pairwise constraint propagation into a joint model so that the three steps enhance one another. We also prove theoretically that a better disambiguated label matrix helps improve clustering performance. Comprehensive experiments demonstrate that our method achieves superior performance compared with state-of-the-art constrained clustering methods, and outperforms PLL and semi-supervised PLL methods when only a limited number of samples are annotated. The code and appendix are publicly available at https://github.com/xyt-ml/PLC.
Keywords:
Machine Learning: ML: Weakly supervised learning
Machine Learning: ML: Multi-label learning
Machine Learning: ML: Self-supervised learning
Machine Learning: ML: Unsupervised learning
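
Illustrative sketch (not the authors' implementation): the first three steps named in the abstract (weight matrix construction, candidate label disambiguation, and derivation of must-link and cannot-link constraints) can be pictured with the minimal Python example below. It is a simplified, sequential stand-in under stated assumptions: the Gaussian kNN weights, the iterative label propagation rule, and all function names and parameters (`knn_weight_matrix`, `disambiguate`, `pairwise_constraints`, `k`, `alpha`, `n_iter`) are hypothetical choices made here for illustration. The joint optimization and the adversarial prior promoted dual-graph constraint propagation described in the paper are not reproduced; the released code at the URL above is the authoritative reference.

```python
import numpy as np


def knn_weight_matrix(X, k=3):
    """Symmetric kNN affinity with Gaussian weights (an assumed choice)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of examples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)            # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]     # k nearest neighbors per row
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    sigma2 = d2[rows, cols].mean() + 1e-12  # bandwidth: mean kNN distance
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    return np.maximum(W, W.T)               # symmetrize


def disambiguate(W, Y, n_iter=20, alpha=0.5):
    """Propagate label confidences over W, restricted to candidate sets.

    Y is an (n, c) binary candidate-label matrix; an all-zero row marks an
    example without partial labels. Returns confidences and a labeled mask.
    """
    Y = np.asarray(Y, dtype=float)
    labeled = Y.sum(axis=1) > 0
    F0 = np.zeros_like(Y)
    F0[labeled] = Y[labeled] / Y[labeled].sum(axis=1, keepdims=True)
    P = W / (W.sum(axis=1, keepdims=True) + 1e-12)  # row-normalized weights
    F = F0.copy()
    for _ in range(n_iter):
        F = alpha * (P @ F) + (1 - alpha) * F0
        F = F * Y                                   # keep candidates only
        s = F.sum(axis=1, keepdims=True)
        F = np.divide(F, s, out=np.zeros_like(F), where=s > 0)
    return F, labeled


def pairwise_constraints(F, labeled):
    """Must-link / cannot-link pairs among examples that carry partial labels."""
    pred = F.argmax(axis=1)
    idx = np.flatnonzero(labeled)
    must, cannot = [], []
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            i, j = int(idx[a]), int(idx[b])
            (must if pred[i] == pred[j] else cannot).append((i, j))
    return must, cannot
```

In this sketch, examples without candidate labels receive no constraints of their own; in the paper, such examples are reached by propagating the initial constraints, which this example does not attempt.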