Bipartite Matching for Crowd Counting with Point Supervision

Hao Liu, Qiang Zhao, Yike Ma, Feng Dai

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 860-866. https://doi.org/10.24963/ijcai.2021/119

For the crowd counting task, it has been demonstrated that imposing Gaussians on point annotations hurts generalization performance. Several methods therefore use point annotations directly as supervision, and they have achieved significant improvements over density-map based methods. However, these point-based methods ignore the inevitable annotation noise and still suffer from low robustness to noisy annotations. To address this problem, we propose a bipartite matching based method for crowd counting with only point supervision (BM-Count). In BM-Count, we select a subset of the most similar pixels from the predicted density map and match them to the annotated pixels via bipartite matching. Loss functions can then be defined on the matched pairs to alleviate the adverse effect of annotated dots with incorrect positions. Under noisy annotations, our method reduces MAE and RMSE by 9% and 11.2%, respectively. Moreover, we propose a novel ranking distribution learning framework to address the imbalanced distribution of head counts; it encodes the head counts as a classification distribution in the ranking domain and refines the estimated count map in the continuous domain. Extensive experiments on four datasets show that our method achieves state-of-the-art performance and performs better crowd localization.
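
The matching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matching cost, so the cost used here (Euclidean distance minus a weighted predicted density) and the names `hungarian`, `match_annotations`, and `density_weight` are assumptions made only for illustration. The assignment itself is solved with a standard Hungarian-style algorithm written in plain Python.

```python
import math

def hungarian(cost):
    """Minimum-cost assignment of every row to a distinct column.

    cost is an n x m list of lists with n <= m. Returns a list `match`
    where match[i] is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the assignments along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    return match

def match_annotations(points, candidates, density_weight=1.0):
    """Match each annotated point to one candidate pixel.

    points: list of (x, y) annotations.
    candidates: list of ((x, y), predicted_density) pixels.
    The cost below is an assumed stand-in for the paper's similarity measure.
    """
    cost = [
        [math.hypot(px - cx, py - cy) - density_weight * d
         for (cx, cy), d in candidates]
        for (px, py) in points
    ]
    return hungarian(cost)

# Two annotations compete for the same nearest candidate pixel.
points = [(0, 0), (0, 2)]
candidates = [((0, 1), 0.9), ((0, 4), 0.8), ((5, 5), 0.1)]
matches = match_annotations(points, candidates)
```

In this toy example both annotations are closest to the candidate at (0, 1), so a nearest-neighbour rule would assign them to the same pixel; the one-to-one matching instead pairs the second annotation with the candidate at (0, 4). In practice a library routine such as SciPy's `scipy.optimize.linear_sum_assignment` could replace the hand-written solver.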
Keywords:
Computer Vision: Perception
Computer Vision: Video: Events, Activities and Surveillance
Machine Learning: Deep Learning