Omni-Dimensional State Space Model-driven SAM for Pixel-level Anomaly Detection

Chao Huang, Qianyi Li, Jie Wen, Bob Zhang

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 1152-1160. https://doi.org/10.24963/ijcai.2025/129

Pixel-level anomaly detection is indispensable in industrial defect detection and medical diagnosis. Recently, the Segment Anything Model (SAM) has achieved promising results in many vision tasks. However, applying SAM directly to pixel-level anomaly detection yields unsatisfactory performance, and SAM also requires manual prompts. Although some automatic prompting approaches for SAM have been proposed, they use only partial image features as prompts and fail to incorporate crucial features, such as multi-scale image features, that would produce more suitable prompts. In this paper, we propose a novel Omni-Dimensional State Space Model-driven SAM (ODS-SAM) for pixel-level anomaly detection. Specifically, the proposed method adopts the SAM architecture, ensuring easy implementation and avoiding the need for fine-tuning. A State Space Model-based residual Omni-Dimensional module is designed to automatically generate suitable prompts. This module effectively leverages multi-scale and global information, facilitating an iterative search for optimal prompts in the prompt space. The identified optimal prompts are then fed into SAM as high-dimensional tensors. Experimental results demonstrate that the proposed ODS-SAM outperforms state-of-the-art models on both industrial and medical image datasets.
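To make the prompt-generation idea more concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes a scalar discrete linear state space recurrence (h_k = a·h_{k-1} + b·u_k, y_k = c·h_k), a toy average-pooling step standing in for multi-scale feature extraction, and a residual connection that adds the input tokens back to the scan output. All function names (`ssm_scan`, `multi_scale_tokens`, `prompt_from_image`) and parameter values are hypothetical; the actual ODS-SAM module, its dimensions, and how its output is passed to SAM are not specified in this abstract.

```python
from typing import List, Sequence


def ssm_scan(inputs: Sequence[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar discrete linear state space model over a sequence.

    Recurrence (assumed for illustration):
        h_k = a * h_{k-1} + b * u_k
        y_k = c * h_k
    """
    h = 0.0
    outputs = []
    for u in inputs:
        h = a * h + b * u      # state update carries information from earlier tokens
        outputs.append(c * h)  # readout of the current state
    return outputs


def multi_scale_tokens(image: Sequence[Sequence[float]],
                       scales: Sequence[int] = (1, 2)) -> List[float]:
    """Average-pool a 2D grid at several block sizes and flatten to one sequence.

    This is a toy stand-in for multi-scale image features.
    """
    rows, cols = len(image), len(image[0])
    tokens = []
    for s in scales:
        for r in range(0, rows, s):
            for col in range(0, cols, s):
                block = [image[i][j]
                         for i in range(r, min(r + s, rows))
                         for j in range(col, min(col + s, cols))]
                tokens.append(sum(block) / len(block))
    return tokens


def prompt_from_image(image: Sequence[Sequence[float]],
                      scales: Sequence[int] = (1, 2),
                      a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Build a hypothetical prompt vector: SSM scan over multi-scale tokens plus a residual."""
    tokens = multi_scale_tokens(image, scales)
    scanned = ssm_scan(tokens, a, b, c)
    # Residual connection: add the original tokens back to the scan output.
    return [t + y for t, y in zip(tokens, scanned)]


if __name__ == "__main__":
    toy_image = [[1.0, 2.0], [3.0, 4.0]]
    print(prompt_from_image(toy_image))
```

In a real pipeline, a vector like this would be replaced by a learned, high-dimensional tensor and supplied to a frozen segmentation model in place of manual point or box prompts; that wiring is omitted here because the abstract does not describe it.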
Keywords:
Computer Vision: CV: Recognition (object detection, categorization)
Computer Vision: CV: Representation learning
Machine Learning: ML: Feature extraction, selection and dimensionality reduction
Machine Learning: ML: Multi-view learning