Themis: A Fair Evaluation Platform for Computer Vision Competitions

Zinuo Cai, Jianyong Yuan, Yang Hua, Tao Song, Hao Wang, Zhengui Xue, Ningxin Hu, Jonathan Ding, Ruhui Ma, Mohammad Reza Haghighat, Haibing Guan

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 599-605. https://doi.org/10.24963/ijcai.2021/83

It has become increasingly difficult for computer vision competitions to preserve fairness when participants intentionally fine-tune their models against the test datasets to improve performance. To mitigate such unfairness, competition organizers restrict how participants train and evaluate their models. However, such restrictions introduce substantial computational overhead for organizers and potential intellectual property leakage for participants. Thus, we propose Themis, a framework in which organizers and participants jointly train a noise generator that prevents intentional fine-tuning by protecting test datasets from surreptitious manual labeling. Specifically, with the carefully designed noise generator, Themis adds noise to perturb test sets without distorting the performance ranking of participants' models. We evaluate the effectiveness of Themis with a wide spectrum of real-world models and datasets. Our experimental results show that Themis effectively enforces competition fairness by precluding manual labeling of test sets while preserving the performance ranking of participants' models.
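
The abstract places two requirements on the noise generator: the perturbed test images must resist surreptitious manual labeling, yet the relative ranking of participants' models must be preserved. The sketch below illustrates the second requirement in PyTorch; the generator architecture, the epsilon bound, and the pairwise hinge loss are illustrative assumptions on our part, not the formulation used in the paper.

```python
# Hypothetical sketch of a bounded noise generator in the spirit of the
# Themis abstract; architecture, epsilon, and loss are our assumptions,
# not the paper's actual design.
import torch
import torch.nn as nn


class NoiseGenerator(nn.Module):
    """Produces an L_inf-bounded perturbation for each test image."""

    def __init__(self, channels=3, epsilon=8 / 255):
        super().__init__()
        self.epsilon = epsilon
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x):
        # tanh keeps the additive noise within [-epsilon, epsilon]
        noise = self.epsilon * torch.tanh(self.net(x))
        return (x + noise).clamp(0.0, 1.0)


def ranking_preservation_loss(scores_clean, scores_noisy):
    """Penalize pairwise order flips between participants' scores on
    clean vs. perturbed test data.

    Both arguments are tensors of shape [num_models], holding each
    participant model's evaluation score.
    """
    diff_clean = scores_clean.unsqueeze(0) - scores_clean.unsqueeze(1)
    diff_noisy = scores_noisy.unsqueeze(0) - scores_noisy.unsqueeze(1)
    # Hinge on sign disagreement: the term is positive exactly when a
    # pair of models swaps order after perturbation.
    return torch.relu(-diff_clean.sign() * diff_noisy).mean()
```

In a full pipeline, this ranking-preservation term would be combined with an objective that makes the perturbed images hard to label by hand; the abstract does not specify that objective, so it is omitted here.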
Keywords:
Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation
AI Ethics, Trust, Fairness: Fairness
Machine Learning Applications: Applications of Supervised Learning