Projection, Interaction and Fusion: A Progressive Difference Fusion Network for Salient Object Detection

Xiao Ke; Weijie Zhou; Yuzhen Niu

doi:10.24963/ijcai.2025/145

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence

Main Track. Pages 1296-1304. https://doi.org/10.24963/ijcai.2025/145

In recent years, deep learning-based Salient Object Detection (SOD) methods have made tremendous progress; however, their performance in complex scenarios has reached a bottleneck. In this paper, we propose a novel Progressive Difference Fusion Network (PDFNet) based on fine-grained feature fusion. First, to address the scale variability of salient objects, we introduce a Self-Guided Module (SGM) with dynamic receptive fields. Second, to tackle the shape variability of salient objects, we design a Feature Aggregation Module (FAM) incorporating cross convolutions and a feedback loop. Finally, to reduce the confusion between global and detail information that arises during multi-scale feature fusion in existing models, we develop a Progressive Difference Fusion Unit (PDFU), which projects multi-scale features into fine-grained nodes and enhances them through node interaction based on difference features. Additionally, we propose a Conditional Random Field Based on Patch (CRFbp), which focuses on handling discrete points and further improves the model’s performance. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance on five benchmark datasets. Code is available at: https://github.com/pdfnet2025/PDFNet.git.
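The abstract does not specify the operations inside the PDFU, so the following is only an illustrative sketch of the general idea of projecting features at two scales onto a shared set of nodes and then enhancing the finer nodes with a difference feature. Everything here is an assumption made for illustration: the function names `project_to_nodes` and `difference_interaction`, the use of average pooling over patches as the projection, and the mixing weight `alpha` are hypothetical and are not taken from the paper or its released code.

```python
from typing import List

Grid = List[List[float]]


def project_to_nodes(feature: Grid, patch: int) -> List[float]:
    """Average-pool a 2D feature map over non-overlapping patch x patch blocks.

    Assumption: each block becomes one "node". The paper's actual projection
    may differ.
    """
    height, width = len(feature), len(feature[0])
    nodes = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [
                feature[r][c]
                for r in range(top, min(top + patch, height))
                for c in range(left, min(left + patch, width))
            ]
            nodes.append(sum(block) / len(block))
    return nodes


def difference_interaction(
    fine_nodes: List[float], coarse_nodes: List[float], alpha: float = 0.5
) -> List[float]:
    """Enhance fine nodes with a scaled difference feature (coarse - fine).

    Assumption: a simple additive update stands in for the paper's node
    interaction, which is not described in the abstract.
    """
    if len(fine_nodes) != len(coarse_nodes):
        raise ValueError("both scales must be projected to the same node count")
    return [f + alpha * (c - f) for f, c in zip(fine_nodes, coarse_nodes)]


# A 4x4 "fine" map and a 2x2 "coarse" map, both projected to four nodes.
fine_map = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 2.0],
    [0.0, 0.0, 2.0, 2.0],
]
coarse_map = [
    [3.0, 0.0],
    [0.0, 2.0],
]

fine_nodes = project_to_nodes(fine_map, patch=2)
coarse_nodes = project_to_nodes(coarse_map, patch=1)
enhanced_nodes = difference_interaction(fine_nodes, coarse_nodes, alpha=0.5)
```

In this toy setting, nodes where the two scales already agree are left unchanged, while nodes where they disagree are shifted toward the other scale in proportion to `alpha`. For the authors' actual design, refer to the paper and the repository linked above.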

Keywords:

Computer Vision: CV: Low-level Vision

Computer Vision: CV: Recognition (object detection, categorization)