Active Learning from Crowds with Unsure Option
Jinhong Zhong, Ke Tang, Zhi-Hua Zhou
Learning from crowds, where the labels of data instances are collected through crowdsourcing, has attracted much attention over the past few years. In contrast to a typical crowdsourcing setting, where all data instances are assigned to annotators for labeling, active learning from crowds selects a subset of data instances and assigns only those to the annotators, thereby reducing the cost of labeling. This paper goes a step further. Rather than assuming that all annotators must provide labels, we allow annotators to express that they are unsure about the data instances assigned to them. Adding the “unsure” option reduces the annotators' workload, because saying “unsure” is easier than trying to provide a crisp label for a difficult data instance. Moreover, it is safer to use “unsure” feedback than labels from reluctant annotators, because such labels are more likely to be misleading. Furthermore, different annotators may have difficulty with different data instances, so the unsure option provides a valuable ingredient for modeling the crowd's expertise. We propose the ALCU-SVM algorithm for this new learning problem. Experimental studies on simulated and real crowdsourcing data show that, by exploiting the unsure option, ALCU-SVM achieves promising performance.