Towards Debiased Generalized Category Discovery

Pengcheng Guo; Yonghong Song; Boyu Wang

doi:10.24963/ijcai.2025/587

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence

Main Track. Pages 5271-5279. https://doi.org/10.24963/ijcai.2025/587

Generalized Category Discovery (GCD) aims to classify unlabeled training data drawn from both old and novel classes by leveraging information from partially labeled old classes. In this paper, we reveal that existing methods often suffer from competition between new and old classes: the focus on learning new classes often causes notable performance degradation on the old classes. We trace this problem to the GCD classifier, which can be overconfident and biased towards the new classes. With this insight, we propose Debiased GCD (DeGCD), a simple but effective approach that uses a debiased head to mitigate the bias caused by overconfidence in new categories. Specifically, we first propose a semantic calibration loss that helps debias the GCD classifier by enforcing neighborhood prediction consistency with the latent representation of the debiased head. Furthermore, we propose a debiased contrastive objective to refine the similarity matrix from the GCD classifier and the debiased classifier, suppressing overconfidence in new classes on unlabeled data. In addition, an alignment constraint loss is designed to prevent overconfidence in the new categories from damaging the distribution of the old categories. Experiments on various datasets show that DeGCD achieves state-of-the-art performance and maintains a good balance between new and old classes. Moreover, the method can be seamlessly applied to other GCD methods, yielding further performance gains while effectively balancing performance on new and old classes.

Keywords:

Machine Learning: ML: Semi-supervised learning

Machine Learning: ML: Clustering

Machine Learning: ML: Self-supervised Learning