Learning Task-aware Local Representations for Few-shot Learning

Chuanqi Dong, Wenbin Li, Jing Huo, Zheng Gu, Yang Gao

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 716-722. https://doi.org/10.24963/ijcai.2020/100

Few-shot learning for visual recognition aims to adapt to novel unseen classes with only a few images. Recent work, especially work based on low-level information, has made great progress. These methods typically employ local representations (LRs), because LRs are more consistent between seen and unseen classes. However, most of them are limited to an individual image-to-image or image-to-class measurement, which cannot fully exploit the capabilities of LRs, particularly in the context of a given task. This paper proposes an Adaptive Task-aware Local Representations Network (ATL-Net) to address this limitation by introducing episodic attention, which adaptively selects the important local patches across the entire task, mirroring the process of human recognition. We achieve substantially superior results on multiple benchmarks. On miniImagenet, ATL-Net gains 0.93% and 0.88% improvements over the compared methods under the 5-way 1-shot and 5-shot settings, respectively. Moreover, ATL-Net naturally tackles the problem of how to adaptively identify and weight the importance of different key local parts, which is the major concern of fine-grained recognition. Specifically, on the fine-grained Stanford Dogs dataset, ATL-Net outperforms the second-best method by 5.39% and 9.69% under the 5-way 1-shot and 5-shot settings.
Keywords:
Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation
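
To make the episodic attention idea concrete, the following is a minimal sketch of how task-aware weighting of local representations could look. It is an illustrative implementation under assumed conventions, not the authors' exact formulation: the function name episodic_attention_scores, the gating parameters scale and tau, and the tensor shape conventions are all hypothetical.

import torch
import torch.nn.functional as F

def episodic_attention_scores(query_lr, support_lr, scale=20.0, tau=0.5):
    """Sketch of task-aware scoring over local representations (LRs).

    query_lr:   (M, C)      -- M local descriptors of one query image
    support_lr: (N, S, C)   -- S support descriptors per class, N classes in the episode

    Returns class scores of shape (N,), built from adaptively weighted local matches.
    """
    q = F.normalize(query_lr, dim=-1)    # unit-norm local descriptors
    s = F.normalize(support_lr, dim=-1)

    # Cosine similarity of every query patch to every support patch of each class.
    sim = torch.einsum('mc,nsc->nms', q, s)        # (N, M, S)
    best = sim.max(dim=-1).values                  # (N, M): best support match per patch

    # Episodic attention (assumed gating form): a patch is weighted by how
    # discriminative its response is within THIS task; the softmax over
    # classes measures task-specific saliency, and the sigmoid gate with
    # hypothetical parameters scale/tau softly selects important patches.
    attn = torch.sigmoid(scale * (best.softmax(dim=0) - tau))  # (N, M)

    # Class score: attention-weighted sum of the best local matches.
    return (attn * best).sum(dim=-1)               # (N,)

The key point this sketch illustrates is that each local patch is weighted by its discriminative power within the current episode, rather than through a fixed image-level pooling, which is what allows the same mechanism to identify and weight key local parts in fine-grained recognition.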