Asynchronous Active Learning with Distributed Label Querying

Sheng-Jun Huang, Chen-Chen Zong, Kun-Peng Ning, Hai-Bo Ye

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2570-2576. https://doi.org/10.24963/ijcai.2021/354

Active learning aims to learn an effective model at the lowest labeling cost. Most existing active learning methods work synchronously: in each iteration, label querying can be performed only after the model has been updated. Because model training is usually time-consuming, this can cause serious latency between two queries, especially in crowdsourcing environments where many online annotators work simultaneously. Such latency significantly decreases labeling efficiency and strongly limits the application of active learning in real tasks. To overcome this challenge, we propose a multi-server multi-worker framework for asynchronous active learning in a distributed environment. By maintaining two shared pools, one of candidate queries and one of labeled data, the servers, the workers, and the annotators cooperate efficiently with each other without synchronization. Moreover, diverse sampling strategies from the distributed workers are incorporated to select the most useful instances for model improvement. Both theoretical analysis and experimental study validate the effectiveness of the proposed approach.
Keywords:
Machine Learning: Active Learning
Machine Learning: Weakly Supervised Learning
Machine Learning: Semi-Supervised Learning
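
The two-pool coordination described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: threads stand in for distributed nodes, the "model" is a toy one-dimensional threshold, and it assumes that workers select queries while servers retrain. All names (`run_sketch`, `candidate_pool`, `labeled_pool`, `most_uncertain`, `uniform_random`) are hypothetical and chosen only for illustration.

```python
import queue
import random
import threading
import time


def run_sketch(n_items=40, budget=12, n_annotators=3, n_servers=2):
    # Toy 1-D data; the hidden label is 1 when x >= 0.5 (illustrative assumption).
    data = [i / n_items for i in range(n_items)]

    def oracle(x):
        return int(x >= 0.5)

    candidate_pool = queue.Queue(maxsize=n_annotators)  # shared pool of pending queries
    labeled_pool = {}              # shared pool of labeled data: index -> label
    claimed = set()                # indices already proposed as queries
    model = {"threshold": 0.0}     # toy model: a single decision threshold
    lock = threading.Lock()        # guards labeled_pool, claimed, and model
    done = threading.Event()       # set once the labeling budget is spent

    # Two different sampling strategies, one per worker (hypothetical choices).
    def most_uncertain(unclaimed, threshold, rng):
        return min(unclaimed, key=lambda i: abs(data[i] - threshold))

    def uniform_random(unclaimed, threshold, rng):
        return rng.choice(unclaimed)

    def worker(strategy, seed):
        # Proposes queries from whatever model is current; never waits for training.
        rng = random.Random(seed)
        while True:
            with lock:
                if len(claimed) >= budget:
                    return
                unclaimed = [i for i in range(n_items) if i not in claimed]
                idx = strategy(unclaimed, model["threshold"], rng)
                claimed.add(idx)
            candidate_pool.put(idx)  # blocks only while the query pool is full

    def annotator():
        # Pulls a pending query, labels it, and writes to the labeled pool.
        while not done.is_set():
            try:
                idx = candidate_pool.get(timeout=0.01)
            except queue.Empty:
                continue
            label = oracle(data[idx])
            with lock:
                labeled_pool[idx] = label
                if len(labeled_pool) >= budget:
                    done.set()

    def retrain():
        # Fit the threshold on a snapshot of the labeled pool.
        with lock:
            snapshot = dict(labeled_pool)
        zeros = [data[i] for i, y in snapshot.items() if y == 0]
        ones = [data[i] for i, y in snapshot.items() if y == 1]
        if zeros and ones:
            with lock:
                model["threshold"] = (max(zeros) + min(ones)) / 2

    def server():
        # Retrains in the background while querying and labeling continue.
        while not done.is_set():
            retrain()
            time.sleep(0.001)
        retrain()  # final update on the complete labeled pool

    threads = [
        threading.Thread(target=worker, args=(most_uncertain, 0)),
        threading.Thread(target=worker, args=(uniform_random, 1)),
    ]
    threads += [threading.Thread(target=annotator) for _ in range(n_annotators)]
    threads += [threading.Thread(target=server) for _ in range(n_servers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return labeled_pool, model["threshold"]
```

Calling `run_sketch()` returns the labeled pool and the final threshold. Which instances get queried varies from run to run because the threads interleave freely, which is the point of the sketch: annotators are never idle waiting for a retraining step to finish.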