Optimistic Active Learning using Mutual Information
Yuhong Guo, Russ Greiner
An active learning system will sequentially decide which unlabeled instance to label, with the goal of efficiently gathering the information necessary to produce a good classifier. Some such systems greedily select the next instance based only on properties of that instance and the few currently labeled points--- e.g., selecting the one closest to the current classification boundary. Unfortunately, these approaches ignore the valuable information contained in the other unlabeled instances, which can help identify a good classifier much faster. For the previous approaches that do exploit this unlabeled data, this information is mostly used in a conservative way. One common property of the approaches in the literature is that the active learner sticks to one single query selection criterion in the whole process. We propose a system, MM+M, that selects the query instance that is able to provide the maximum conditional mutual information about the labels of the unlabeled instances, given the labeled data, in an optimistic way. This approach implicitly exploits the discriminative partition information contained in the unlabeled data. Instead of using one selection criterion, MM+M also employs a simple on-line method that changes its selection rule when it encounters an unexpected label. Our empirical results demonstrate that this new approach works effectively.