Learnable Frequency Decomposition for Image Forgery Detection and Localization
Dong Li, Jiaying Zhu, Yidi Liu, Xin Lu, Xueyang Fu, Jiawei Liu, Aiping Liu, Zheng-Jun Zha
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 1359-1367.
https://doi.org/10.24963/ijcai.2025/152
Concern for image authenticity spurs research in image forgery detection and localization (IFDL). Most deep learning-based methods focus on spatial-domain modeling and have not fully explored frequency-domain strategies. In this paper, we observe and analyze the changes in frequency characteristics caused by image tampering. Our observations indicate that manipulation traces are especially prominent in the phase component and span both low- and high-frequency bands. Based on these findings, we propose a forensic frequency decomposition network (F2D-Net), which incorporates deep Fourier transforms and leverages phase information together with high- and low-frequency components to enhance IFDL. Specifically, F2D-Net consists of a Spectral Decomposition Subnetwork (SDSN) and a Frequency Separation Subnetwork (FSSN). The SDSN decomposes the image into amplitude and phase and focuses on learning the semantic content in the phase spectrum to identify forged objects, thereby improving forgery detection accuracy. The FSSN adaptively decomposes the output of the SDSN into high- and low-frequency components and applies a divide-and-conquer strategy to refine each frequency band. This mitigates the optimization difficulties caused by forgery traces that are coupled across different frequencies, so the network better captures the pixels belonging to the forged object and improves localization accuracy. Experiments on multiple datasets demonstrate that our method outperforms state-of-the-art IFDL techniques both qualitatively and quantitatively.
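The two decompositions named in the abstract can be illustrated with a plain discrete Fourier transform. The sketch below is not F2D-Net: it assumes a single-channel image stored as a NumPy array, and it replaces the learned, adaptive frequency separation with a fixed radial cutoff purely for illustration. The function names (`amplitude_phase`, `recombine`, `split_bands`) and the `radius` parameter are hypothetical and do not come from the paper.

```python
import numpy as np


def amplitude_phase(image):
    """Split a 2-D image into its Fourier amplitude and phase spectra."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def recombine(amplitude, phase):
    """Rebuild a spatial image from amplitude and phase spectra."""
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))


def split_bands(image, radius):
    """Split an image into low- and high-frequency parts.

    Assumption: a hard circular mask of the given radius (in frequency
    bins) stands in for the adaptive, learned separation in the paper.
    """
    h, w = image.shape
    # Move the zero-frequency bin to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    low_mask = dist <= radius
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)))
    return low, high


# Usage on a random stand-in image (hypothetical data, not a real forgery).
rng = np.random.default_rng(0)
image = rng.random((8, 8))

amplitude, phase = amplitude_phase(image)
restored = recombine(amplitude, phase)

# Phase-only reconstruction: unit amplitude, original phase.
phase_only = recombine(np.ones_like(amplitude), phase)

low, high = split_bands(image, radius=2.0)
```

Because the two masks are complementary, `low + high` reproduces the input up to floating-point error. The phase-only reconstruction shows the kind of signal a phase-focused branch would operate on, under the assumptions stated above.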
Keywords:
Computer Vision: CV: Low-level Vision
Computer Vision: CV: Segmentation, grouping and shape analysis
