Unleashing the Semantic Adaptability of Controlled Diffusion Model for Image Colorization
Xiangcheng Du, Zhao Zhou, Yanlong Wang, Yingbin Zheng, Xingjiao Wu, Peizhu Gong, Cheng Jin
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 945-953.
https://doi.org/10.24963/ijcai.2025/106
Recent data-driven image colorization methods have leveraged pre-trained Text-to-Image (T2I) diffusion models as a generative prior, yet they still suffer from unsatisfactory and inaccurate semantic-level color control. To address these issues, we propose a Semantic Adaptation method (SeAda) that enhances the prior while accounting for the semantic discrepancy between color and grayscale image pairs. SeAda employs a semantic adapter to produce refined semantic embeddings and a controlled T2I diffusion model to create plausibly colored images. Specifically, the semantic adapter transfers the embedding from the grayscale to the color domain, while the diffusion model exploits the refined embedding and prior knowledge to achieve realistic and diverse results. We also design a three-stage training strategy that improves semantic comprehension and prior integration for further performance gains. Extensive experiments on public datasets demonstrate that our method outperforms existing state-of-the-art techniques in image colorization.
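The pipeline described in the abstract — a semantic adapter that maps a grayscale-domain embedding into the color domain before it conditions the controlled diffusion model — can be sketched as follows. This is a minimal illustrative sketch only: the class name `SemanticAdapter`, the linear form of the adapter, the embedding dimension, and the `colorize_step` helper are all assumptions for exposition, not the authors' implementation, and the diffusion model itself is elided.

```python
import numpy as np

rng = np.random.default_rng(0)


class SemanticAdapter:
    """Hypothetical adapter that transfers a semantic embedding from the
    grayscale domain to the color domain (stand-in for the learned module)."""

    def __init__(self, dim: int):
        # A single linear map stands in for the trained adapter network.
        self.W = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.b = np.zeros(dim)

    def refine(self, gray_embedding: np.ndarray) -> np.ndarray:
        # Produce the refined, color-domain semantic embedding.
        return gray_embedding @ self.W + self.b


def colorize_step(gray_embedding: np.ndarray, adapter: SemanticAdapter) -> np.ndarray:
    """One conceptual step: refine the grayscale embedding; the result would
    then condition the controlled T2I diffusion model (omitted here)."""
    return adapter.refine(gray_embedding)


dim = 16
adapter = SemanticAdapter(dim)
gray_emb = rng.standard_normal(dim)  # e.g. an encoder's grayscale embedding
refined = colorize_step(gray_emb, adapter)
print(refined.shape)  # same dimensionality as the input embedding
```

The key design point the abstract emphasizes is that refinement happens in embedding space, so the frozen generative prior of the diffusion model is reused rather than retrained end-to-end.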
Keywords:
Computer Vision: CV: Low-level Vision
Computer Vision: CV: Applications and Systems
