Recovering Accurate Labeling Information from Partially Valid Data for Effective Multi-Label Learning

Ximing Li, Yang Wang

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 1373-1380. https://doi.org/10.24963/ijcai.2020/191

Partial Multi-label Learning (PML) aims to induce a multi-label predictor from datasets with noisy supervision, where each training instance is associated with a set of candidate labels of which only some are valid. To handle this noise, existing PML methods typically recover the ground-truth labels by estimating the ground-truth confidence of each candidate label, i.e., the likelihood that the candidate label is a ground-truth one. However, they neglect the information from non-candidate labels, which can also contribute to ground-truth label recovery. In this paper, we propose to recover the ground-truth labels, i.e., to estimate the ground-truth confidences, from the label enrichment, which is composed of the relevance degrees of candidate labels and the irrelevance degrees of non-candidate labels. Building on this idea, we develop a novel two-stage PML method, namely Partial Multi-Label Learning with Label Enrichment-Recovery (PML3ER): the first stage estimates the label enrichment with unconstrained label propagation, and the second stage jointly learns the ground-truth confidence and the multi-label predictor given the label enrichment. Experimental results show that PML3ER outperforms the state-of-the-art PML methods.
Keywords:
Data Mining: Classification, Semi-Supervised Learning
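
To make the first stage in the abstract more concrete, the following is a minimal sketch of one way label enrichment could be estimated by label propagation. It is not the authors' formulation: the k-nearest-neighbour graph, the +1/-1 initialization for candidate and non-candidate labels, the mixing weight `alpha`, and the function names `knn_weights` and `propagate_enrichment` are all assumptions made for illustration. "Unconstrained" is interpreted here as leaving the propagated scores unclipped and unnormalized, so that non-candidate labels can take negative (irrelevance) values.

```python
import math


def knn_weights(X, k):
    """Row-normalized kNN graph (assumed): each row puts weight 1/k on its k nearest neighbours."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted((math.dist(X[i], X[j]), j) for j in range(n) if j != i)
        for _, j in dists[:k]:
            W[i][j] = 1.0 / k
    return W


def propagate_enrichment(X, Y, k=2, alpha=0.5, iters=50):
    """Hypothetical label-enrichment estimate.

    X: list of feature vectors; Y: 0/1 candidate-label matrix (instances x labels).
    Returns real-valued scores: positive values read as relevance degrees,
    negative values as irrelevance degrees.
    """
    n, q = len(Y), len(Y[0])
    # Assumed initialization: +1 for candidate labels, -1 for non-candidate labels.
    E0 = [[1.0 if y else -1.0 for y in row] for row in Y]
    W = knn_weights(X, k)
    E = [row[:] for row in E0]
    for _ in range(iters):
        # Mix neighbours' scores with the initial labeling; no clipping or
        # per-row normalization is applied to the result.
        E = [
            [
                alpha * sum(W[i][j] * E[j][c] for j in range(n))
                + (1 - alpha) * E0[i][c]
                for c in range(q)
            ]
            for i in range(n)
        ]
    return E


# Toy data: two tight pairs of instances, two labels.
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
Y = [[1, 1], [1, 0], [0, 1], [1, 1]]
E = propagate_enrichment(X, Y, k=1)
```

In this toy example, the first instance lists both labels as candidates, but its only neighbour excludes the second label, so the propagated score for that label is pulled down relative to the first label. This illustrates how non-candidate information from neighbours can inform which candidate labels are more likely to be valid.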