Balancing Invariant and Specific Knowledge for Domain Generalization with Online Knowledge Distillation

Di Zhao, Jingfeng Zhang, Hongsheng Hu, Philippe Fournier-Viger, Gillian Dobbie, Yun Sing Koh

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 2440-2448. https://doi.org/10.24963/ijcai.2025/272

Recent research has demonstrated the effectiveness of knowledge distillation in Domain Generalization. However, existing approaches often overlook domain-specific knowledge and rely on an offline distillation strategy, limiting the effectiveness of knowledge transfer. To address these limitations, we propose Balanced Online knowLedge Distillation (BOLD). BOLD leverages a multi-domain expert teacher model, with each expert specializing in a specific source domain, enabling the student to distill both domain-invariant and domain-specific knowledge. We incorporate the Pareto optimization principle and uncertainty weighting to balance these two types of knowledge, ensuring simultaneous optimization without compromising either. Additionally, BOLD employs an online knowledge distillation strategy, allowing the teacher and student to learn concurrently. This dynamic interaction enables the teacher to adapt based on student feedback, facilitating more effective knowledge transfer. Extensive experiments on seven benchmarks demonstrate that BOLD outperforms state-of-the-art methods. Furthermore, we provide theoretical insights that highlight the importance of domain-specific knowledge and the advantages of uncertainty weighting.
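As a rough illustration of the training loop the abstract describes, the sketch below combines a domain-specific and a domain-invariant distillation term under learnable uncertainty weights while the expert teachers update in the same step. It is not the authors' released implementation: the names (`kd_loss`, `UncertaintyWeighting`, `online_distillation_step`, `experts`, `domain_idx`) are hypothetical, the weighting follows the standard homoscedastic-uncertainty formulation as one plausible reading of "uncertainty weighting", and the Pareto-optimization component mentioned in the abstract is omitted.

```python
# Minimal sketch of uncertainty-weighted online knowledge distillation.
# All class/function names are illustrative stand-ins, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def kd_loss(logits, target_logits, T=4.0):
    """Temperature-scaled KL distillation loss on soft targets."""
    return F.kl_div(
        F.log_softmax(logits / T, dim=1),
        F.softmax(target_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)


class UncertaintyWeighting(nn.Module):
    """Learnable log-variances balancing the invariant and specific terms
    (homoscedastic-uncertainty weighting in the style of Kendall et al.)."""

    def __init__(self, num_terms=2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_terms))

    def forward(self, losses):
        total = 0.0
        for i, loss in enumerate(losses):
            total = total + torch.exp(-self.log_vars[i]) * loss + self.log_vars[i]
        return total


def online_distillation_step(student, experts, weighting, optimizer, x, y, domain_idx):
    """One concurrent update of the student and the per-domain expert teachers.

    domain_idx[b] selects the expert matching sample b's source domain, so its
    logits carry domain-specific knowledge; the average over all experts is
    used here as a domain-invariant teacher signal.
    """
    expert_logits = torch.stack([e(x) for e in experts])    # [E, B, C]
    batch = torch.arange(x.size(0))
    specific = expert_logits[domain_idx, batch]              # matching expert, [B, C]
    invariant = expert_logits.mean(dim=0)                    # expert ensemble, [B, C]

    s_logits = student(x)
    loss_task = F.cross_entropy(s_logits, y)
    loss_inv = kd_loss(s_logits, invariant.detach())         # domain-invariant KD
    loss_spec = kd_loss(s_logits, specific.detach())         # domain-specific KD

    # Online KD: experts train in the same step, on hard labels plus a
    # mutual-learning term that feeds student predictions back to the teachers.
    loss_experts = F.cross_entropy(specific, y) + kd_loss(specific, s_logits.detach())

    loss = loss_task + weighting([loss_inv, loss_spec]) + loss_experts
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Example wiring (all shapes and constructors hypothetical): the optimizer must
# cover the student, the experts, and the uncertainty weights, e.g.
#   optimizer = torch.optim.SGD(
#       list(student.parameters()) + list(experts.parameters())
#       + list(weighting.parameters()), lr=1e-3, momentum=0.9)
```

In the paper, the balance between the two distillation terms is additionally governed by the Pareto optimization principle; the sketch only shows the uncertainty-weighting half of that mechanism.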
Keywords:
Computer Vision: CV: Transfer, low-shot, semi- and un-supervised learning
Machine Learning: ML: Foundation models