Dite-HRNet: Dynamic Lightweight High-Resolution Network for Human Pose Estimation

Qun Li; Ziyi Zhang; Fu Xiao; Feng Zhang; Bir Bhanu

doi:10.24963/ijcai.2022/153

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

Main Track. Pages 1095-1101. https://doi.org/10.24963/ijcai.2022/153

A high-resolution network exhibits remarkable capability in extracting multi-scale features for human pose estimation, but fails to capture long-range interactions between joints and has high computational complexity. To address these problems, we present a Dynamic lightweight High-Resolution Network (Dite-HRNet), which can efficiently extract multi-scale contextual information and model long-range spatial dependency for human pose estimation. Specifically, we propose two methods, dynamic split convolution and adaptive context modeling, and embed them into two novel lightweight blocks: the dynamic multi-scale context block and the dynamic global context block. These two blocks, the basic building units of Dite-HRNet, are specially designed for high-resolution networks to make full use of the parallel multi-resolution architecture. Experimental results show that the proposed network achieves superior performance on both the COCO and MPII human pose estimation datasets, surpassing state-of-the-art lightweight networks. Code is available at: https://github.com/ZiyiZhang27/Dite-HRNet.

Keywords:

Computer Vision: Biometrics, Face, Gesture and Pose Recognition

Computer Vision: Action and Behaviour Recognition

Machine Learning: Convolutional Networks