Training-free Fourier Phase Diffusion for Style Transfer
Siyuan Zhang, Wei Ma, Libin Liu, Zheng Li, Hongbin Zha
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 2386-2394.
https://doi.org/10.24963/ijcai.2025/266
Diffusion models have shown significant potential for image style transfer tasks. However, achieving effective stylization while preserving content in a training-free setting remains challenging due to the tightly coupled representation space and inherent randomness of these models. In this paper, we propose a Fourier phase diffusion model that addresses this challenge. Since the Fourier phase spectrum encodes an image's edge structures, we modulate the intermediate diffusion samples with the Fourier phase of a content image to conditionally guide the diffusion process. This ensures content retention while fully exploiting the diffusion model's style generation capabilities. To implement this, we introduce a content phase spectrum incorporation method that aligns with the characteristics of the diffusion process, preventing interference with generative stylization. To further enhance content preservation, we integrate homomorphic semantic features extracted from the content image at each diffusion stage. Extensive experimental results demonstrate that our method outperforms state-of-the-art models in both content preservation and stylization. Code is available at https://github.com/zhang2002forwin/Fourier-Phase-Diffusion-for-Style-Transfer.
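For intuition, the sketch below illustrates the core idea of phase-spectrum modulation in the Fourier domain: an intermediate sample keeps its own magnitude spectrum while its phase is (partially) replaced by that of the content image. This is a minimal PyTorch sketch, not the paper's implementation; the function name `swap_phase_with_content` and the blending weight `alpha` are assumptions, and the actual method additionally schedules the phase incorporation to suit the diffusion process and injects homomorphic semantic features.

```python
import torch

def swap_phase_with_content(x_t: torch.Tensor, content: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Illustrative Fourier phase modulation (hypothetical helper).

    Keeps the magnitude spectrum of the intermediate diffusion sample x_t
    and blends in the phase spectrum of the content image.

    x_t, content: tensors of shape (B, C, H, W).
    alpha: weight on the content phase (1.0 = full replacement).
    """
    # Forward 2D FFT of the intermediate sample and the content image.
    X = torch.fft.fft2(x_t)
    C = torch.fft.fft2(content)

    # Decompose into magnitude and phase spectra.
    mag_x = torch.abs(X)
    phase_x = torch.angle(X)
    phase_c = torch.angle(C)

    # Blend phases; with alpha = 1.0 only the content phase is used.
    phase = (1.0 - alpha) * phase_x + alpha * phase_c

    # Recombine magnitude and phase, then return to the spatial domain.
    X_new = torch.polar(mag_x, phase)
    return torch.fft.ifft2(X_new).real
```

Because the phase spectrum carries edge and structure information while the magnitude carries much of the appearance statistics, such a swap preserves the content layout while leaving room for the diffusion model to generate style.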
Keywords:
Computer Vision: CV: Image and video synthesis and generation
Computer Vision: CV: Applications and Systems
