Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

Baisheng Lai, Xiaojin Gong

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 2053-2059. https://doi.org/10.24963/ijcai.2017/285

Weakly supervised object detection (WSOD), the problem of learning detectors from image-level labels alone, has attracted growing interest. The problem is challenging because no location supervision is available. To address this, this paper integrates saliency into a deep architecture in which location information is exploited both explicitly and implicitly. Specifically, we select highly confident object proposals under the guidance of class-specific saliency maps. The location information of the selected proposals, together with their semantic and saliency information, is then used to explicitly supervise the network through two additional losses. Meanwhile, a saliency prediction sub-network is built into the architecture, and its predictions implicitly guide the localization procedure. The entire network is trained end-to-end. Experiments on PASCAL VOC demonstrate that our approach outperforms all state-of-the-art methods.
Keywords:
Machine Learning: Multi-instance/Multi-label/Multi-view learning
Machine Learning: Deep Learning
Robotics and Vision: Vision and Perception
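
The proposal selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (mean saliency inside a box), the threshold value, and the names `mean_saliency` and `select_confident_proposals` are all assumptions made for illustration, since the abstract does not specify how confidence is computed.

```python
# Illustrative sketch only: the abstract does not define the selection rule.
# Assumption: a proposal is "highly confident" when the mean value of a
# class-specific saliency map inside the box reaches a threshold.

def mean_saliency(saliency, box):
    """Mean saliency inside a box given as (x1, y1, x2, y2), upper bounds exclusive.

    Assumes the box is non-empty and lies within the map.
    """
    x1, y1, x2, y2 = box
    values = [saliency[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(values) / len(values)


def select_confident_proposals(saliency, boxes, threshold=0.5):
    """Return (indices of kept proposals, per-proposal scores).

    The threshold of 0.5 is a hypothetical default, not a value from the paper.
    """
    scores = [mean_saliency(saliency, box) for box in boxes]
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return keep, scores


# Toy 8x8 saliency map with a salient 4x4 square in the centre.
saliency = [
    [1.0 if 2 <= y < 6 and 2 <= x < 6 else 0.0 for x in range(8)]
    for y in range(8)
]
boxes = [
    (2, 2, 6, 6),  # tightly covers the salient square
    (0, 0, 8, 8),  # whole image
    (0, 0, 2, 2),  # background corner
]
keep, scores = select_confident_proposals(saliency, boxes)
```

On this toy input only the tight box is kept. A mean-saliency score favours small boxes centred on a saliency peak, so a practical selection rule would likely also account for how much of the salient region a box covers.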