From Sparse to Complete: Semantic Understanding Based on Stroke Evolution in On-the-fly Sketch-based Image Retrieval
Yingge Liu, Dawei Dai, Xiangling Hou, Shilin Zhao, Guoyin Wang
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 1639-1647.
https://doi.org/10.24963/ijcai.2025/183
In contrast with human sketching, which pre-conceptualizes outlines and features, conventional sketch retrieval models rely primarily on pixel-level processing and feature extraction, limiting their ability to capture early sketch intent. Consequently, these models are susceptible to subjective stroke noise, reducing retrieval accuracy. To address this issue, we propose a novel on-the-fly noise stroke retrieval framework designed to align with human sketch-drawing cognition. The proposed framework introduces two core innovations. (i) A stroke consistency detection module that effectively discriminates and suppresses noise strokes by quantifying the structural similarity between the current stroke and the target image, as well as its alignment with key skeletal components. (ii) An adaptive gated mixture of experts module that dynamically selects and integrates features from multiple expert networks during the early, sparse stages of sketching, thereby capturing relevant information with greater precision. Experimental results across diverse sketch datasets demonstrate that the proposed method effectively identifies and suppresses early noise strokes, significantly enhances sketch retrieval performance, and exhibits strong robustness across varying sketch styles.
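To illustrate the general idea behind a gated mixture of experts, the following is a minimal sketch in plain Python. It is not the architecture from the paper: the linear experts, the softmax gate, the dimensions, and all names (`gated_mixture`, `expert_weights`, `gate_weights`, `sketch_features`) are assumptions chosen for illustration. It only shows how input-dependent gate weights can blend the outputs of several expert networks.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gated_mixture(features, expert_weights, gate_weights):
    """Blend expert outputs with input-dependent softmax gate weights.

    features:       input feature vector (length d_in)
    expert_weights: one (d_out x d_in) matrix per hypothetical linear expert
    gate_weights:   (num_experts x d_in) matrix giving one score per expert
    """
    gates = softmax(matvec(gate_weights, features))
    outputs = [matvec(w, features) for w in expert_weights]
    d_out = len(outputs[0])
    mixed = [sum(g * out[i] for g, out in zip(gates, outputs))
             for i in range(d_out)]
    return mixed, gates


# Hypothetical sizes and randomly initialised weights, for illustration only.
rng = random.Random(0)
d_in, d_out, num_experts = 4, 3, 2
experts = [[[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
           for _ in range(num_experts)]
gate = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(num_experts)]

# Stand-in for features extracted from a sparse, partially drawn sketch.
sketch_features = [0.2, -0.5, 0.1, 0.9]

mixed, gates = gated_mixture(sketch_features, experts, gate)
```

In this sketch the gate weights are positive and sum to one, so the mixed feature is a convex combination of the expert outputs. A trained system would learn both the experts and the gate, which is outside the scope of this illustration.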
Keywords:
Computer Vision: CV: Image and video retrieval
Computer Vision: CV: Representation learning
Humans and AI: HAI: Human-computer interaction
