Transductive Optimization of Top k Precision / 1781
Li-Ping Liu, Thomas G. Dietterich, Nan Li, Zhi-Hua Zhou
Consider a binary classification problem in which the learner is given a labeled training set, an unlabeled test set, and is restricted to choosing exactly k test points to output as positive predictions. Problems of this kind — transductive precision@k — arise in many applications. Previous methods solve these problems in two separate steps, learning the model and selecting k test instances by thresholding their scores. In this way, model training is not aware of the constraint of choosing k test instances as positive in the test phase. This paper shows the importance of incorporating the knowledge of k into the learning process and introduces a new approach, Transductive Top K (TTK), that seeks to minimize the hinge loss over all training instances under the constraint that exactly k test instances are predicted as positive. The paper presents two optimization methods for this challenging problem. Experiments and analysis confirm the benefit of incoporating k in the learning process. In our experimental evaluations, the performance of TTK matches or exceeds existing state-of-the-art methods on 7 benchmark datasets for binary classification and 3 reserve design problem instances.