Bayesian Decision Process for Budget-efficient Crowdsourced Clustering
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 2044-2050. https://doi.org/10.24963/ijcai.2020/283
The performance of clustering depends on an appropriately defined similarity between two items. When the similarity is measured based on human perception, human workers are often employed to estimate a similarity score between items in order to support clustering, a procedure called crowdsourced clustering. When a monetary reward is paid to a worker for each similarity score, and both the similarities between pairs and the workers' reliability vary widely, a limited budget makes it critical to assign pairs of items to workers wisely in order to optimize the clustering result. We model this budget allocation problem as a Markov decision process in which item pairs are dynamically assigned to workers based on the similarity scores they have previously provided. We propose an optimistic knowledge gradient policy in which the assignment of items at each stage is based on the minimum-weight K-cut defined on a similarity graph. We provide simulation studies and real data analysis to demonstrate the performance of the proposed method.
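To make two ingredients named in the abstract concrete, the following minimal Python sketch shows a Bayesian update of pairwise similarity from worker scores and a minimum-weight K-cut on the resulting similarity graph. It is an illustration under stated assumptions, not the paper's method: the Beta-Bernoulli model, the binary scores, the brute-force search, and all function names are hypothetical, worker reliability is ignored, and the optimistic knowledge gradient policy itself is not implemented.

```python
from itertools import combinations, product


def update_posterior(posterior, pair, score):
    """Beta-Bernoulli update (assumed model): score is 1 (similar) or 0 (dissimilar)."""
    a, b = posterior[pair]
    posterior[pair] = (a + score, b + (1 - score))


def posterior_mean(posterior, pair):
    """Posterior mean of the similarity of a pair under Beta(a, b)."""
    a, b = posterior[pair]
    return a / (a + b)


def min_weight_k_cut(n_items, weights, k):
    """Brute-force minimum-weight K-cut; only feasible for very small graphs.

    Tries every assignment of items to k labels, skips assignments with an
    empty cluster, and returns the one whose cross-cluster edge weight is lowest.
    """
    best_cost, best_labels = float("inf"), None
    for labels in product(range(k), repeat=n_items):
        if len(set(labels)) < k:
            continue  # every cluster must be non-empty
        cost = sum(w for (i, j), w in weights.items() if labels[i] != labels[j])
        if cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_cost, best_labels


# Toy data (hypothetical): 4 items, uniform Beta(1, 1) prior on every pair.
n_items = 4
posterior = {pair: (1, 1) for pair in combinations(range(n_items), 2)}

# Three simulated worker scores per pair: items {0, 1} and {2, 3} look similar.
similar_pairs = {(0, 1), (2, 3)}
for pair in list(posterior):
    for _ in range(3):
        update_posterior(posterior, pair, 1 if pair in similar_pairs else 0)

# Edge weights of the similarity graph are the posterior mean similarities.
weights = {pair: posterior_mean(posterior, pair) for pair in posterior}
cost, labels = min_weight_k_cut(n_items, weights, k=2)
print(labels, round(cost, 3))
```

In this toy setting the cut separates items 0 and 1 from items 2 and 3, because removing the four low-similarity edges is cheaper than isolating any single item. A budget-aware policy would additionally decide which pair to send to which worker next, which this sketch leaves out.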
Machine Learning: Unsupervised Learning
Machine Learning Applications: Applications of Unsupervised Learning