Extracting Privileged Information from Untagged Corpora for Classifier Learning

Yazhou Yao, Jian Zhang, Fumin Shen, Wankou Yang, Xian-Sheng Hua, Zhenmin Tang

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 1085-1091. https://doi.org/10.24963/ijcai.2018/151

The performance of data-driven learning approaches is often unsatisfactory when the training data is inadequate in either quantity or quality. Manually labeled privileged information (PI), e.g., attributes, tags, or properties, is usually incorporated to improve classifier learning. However, manual labeling is time-consuming and labor-intensive. To address this issue, we propose to enhance classifier learning by extracting PI from untagged corpora, which can effectively eliminate the dependency on manually labeled data. In detail, we treat each selected PI as a subcategory and learn one classifier per subcategory independently. The classifiers for all subcategories are then integrated to form a more powerful category classifier. In particular, we propose a new instance-level multi-instance learning (MIL) model that simultaneously selects a subset of training images from each subcategory and learns the optimal classifiers based on the selected images. Extensive experiments demonstrate the superiority of our approach.
Keywords:
Computer Vision: Language and Vision
Humans and AI: Cognitive Systems
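
The pipeline outlined in the abstract (one classifier per PI-derived subcategory, instance selection within each subcategory, then integration into a category classifier) can be illustrated with a toy sketch. This is not the MIL formulation proposed in the paper: the difference-of-means linear scorer, the top-k re-selection loop, the max-score integration rule, and all names (train_subcategory, category_score, keep_ratio, the toy feature vectors) are assumptions chosen only to make the idea concrete.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Model = Tuple[List[float], float]  # (weights, bias) of a linear scorer


def mean(vectors: Sequence[Vector]) -> List[float]:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def train_linear(pos: Sequence[Vector], neg: Sequence[Vector]) -> Model:
    """Assumed stand-in classifier: w = mean(pos) - mean(neg),
    with the decision boundary through the midpoint of the two means."""
    mu_p, mu_n = mean(pos), mean(neg)
    w = [p - n for p, n in zip(mu_p, mu_n)]
    mid = [(p + n) / 2 for p, n in zip(mu_p, mu_n)]
    return w, -dot(w, mid)


def score(model: Model, x: Vector) -> float:
    w, b = model
    return dot(w, x) + b


def train_subcategory(candidates: Sequence[Vector], negatives: Sequence[Vector],
                      keep_ratio: float = 0.75, rounds: int = 3):
    """Alternate between (1) fitting a classifier on the selected instances
    and (2) re-selecting the top-scoring candidates. A simplified,
    hypothetical substitute for joint instance selection and learning."""
    keep = max(1, int(round(keep_ratio * len(candidates))))
    selected = list(candidates)
    model = train_linear(selected, negatives)
    for _ in range(rounds):
        ranked = sorted(candidates, key=lambda x: score(model, x), reverse=True)
        selected = ranked[:keep]
        model = train_linear(selected, negatives)
    return model, selected


def category_score(models: Dict[str, Model], x: Vector) -> float:
    """Integrate subcategory classifiers by taking the maximum score
    (an assumed integration rule, not necessarily the paper's)."""
    return max(score(m, x) for m in models.values())


# Toy 2-D features; the last candidate in each subcategory is a noisy image.
negatives = [(-1.0, -1.0), (-1.2, -0.8), (-0.9, -1.1)]
subcategories = {
    "pi_a": [(3.0, 0.0), (3.2, 0.1), (2.9, -0.1), (-1.0, -1.1)],
    "pi_b": [(0.0, 3.0), (0.1, 3.1), (-0.1, 2.9), (-1.1, -0.9)],
}

models, kept = {}, {}
for name, candidates in subcategories.items():
    models[name], kept[name] = train_subcategory(candidates, negatives)
```

In this sketch a test point counts as belonging to the category whenever category_score is positive, so a sample that matches only one subcategory is still accepted, which is the intuition behind combining the per-subcategory classifiers.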