Self-Supervised Tuning for Few-Shot Segmentation
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 1019-1025. https://doi.org/10.24963/ijcai.2020/142
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples. It is a challenging task since the dense prediction can only be achieved under the guidance of latent features defined by sparse annotations. Existing meta-learning based method tends to fail in generating category-specifically discriminative descriptor when the visual features extracted from support images are marginalized in embedding space. To address this issue, this paper presents an adaptive tuning framework, in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme, augmenting category-specific descriptors for label prediction. Specifically, a novel self-supervised inner-loop is firstly devised as the base learner to extract the underlying semantic features from the support image. Then, gradient maps are calculated by back-propagating self-supervised loss through the obtained features, and leveraged as guidance for augmenting the corresponding elements in the embedding space. Finally, with the ability to continuously learn from different episodes, an optimization-based meta-learner is adopted as outer loop of our proposed framework to gradually refine the segmentation results. Extensive experiments on benchmark PASCAL-5i and COCO-20i datasets demonstrate the superiority of our proposed method over state-of-the-art.
Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation
Machine Learning: Deep Learning
Machine Learning: Other