WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation

WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation

Zesen Cheng, Peng Jin, Hao Li, Kehan Li, Siheng Li, Xiangyang Ji, Chang Liu, Jie Chen

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 636-644. https://doi.org/10.24963/ijcai.2023/71

The top-down and bottom-up methods are two mainstreams of referring segmentation, while both methods have their own intrinsic weaknesses. Top-down methods are chiefly disturbed by Polar Negative (PN) errors owing to the lack of fine-grained cross-modal alignment. Bottom-up methods are mainly perturbed by Inferior Positive (IP) errors due to the lack of prior object information. Nevertheless, we discover that two types of methods are highly complementary for restraining respective weaknesses but the direct average combination leads to harmful interference. In this context, we build Win-win Cooperation (WiCo) to exploit complementary nature of two types of methods on both interaction and integration aspects for achieving a win-win improvement. For the interaction aspect, Complementary Feature Interaction (CFI) introduces prior object information to bottom-up branch and provides fine-grained information to top-down branch for complementary feature enhancement. For the integration aspect, Gaussian Scoring Integration (GSI) models the gaussian performance distributions of two branches and weighted integrates results by sampling confident scores from the distributions. With our WiCo, several prominent bottom-up and top-down combinations achieve remarkable improvements on three common datasets with reasonable extra costs, which justifies effectiveness and generality of our method.
Keywords:
Computer Vision: CV: Vision and languageĀ 
Computer Vision: CV: Segmentation