GestureDet: Real-time Student Gesture Analysis with Multi-dimensional Attention-based Detector

Rui Zheng, Fei Jiang, Ruimin Shen

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 680-686. https://doi.org/10.24963/ijcai.2020/95

Students’ gestures, including hand-raising, standing up, and sleeping, indicate the engagement of students in classrooms and partially reflect teaching quality. Therefore, recognizing these gestures quickly and automatically is of great importance. Because computational resources in primary and secondary schools are limited, we propose a real-time student behavior detector based on the lightweight MobileNetV2-SSD to reduce the dependency on GPUs. First, we build a large-scale corpus from real schools to capture various behavior gestures. Based on this corpus, we cast the gesture recognition task as object detection. Second, we design a multi-dimensional attention-based detector, named GestureDet, for real-time and accurate gesture analysis. The multi-dimensional attention mechanisms simultaneously consider all the dimensions of the training set, aiming to pay more attention to the discriminative features and samples that are important for the final performance. Specifically, the spatial attention is constructed with stacked dilated convolution layers to generate a soft, learnable mask for re-weighting foreground and background features; the channel attention introduces context modeling and a squeeze-and-excitation module to focus on discriminative features; and the batch attention identifies important samples with a newly designed re-weighting strategy. Experimental results demonstrate the effectiveness and versatility of GestureDet, which achieves 75.2% mAP on a real student behavior dataset and 74.5% mAP on the public PASCAL VOC dataset at 20 fps on the Nvidia Jetson TX2 embedded device. Code will be made publicly available.
Keywords:
Computer Vision: Biometrics, Face and Gesture Recognition
Multidisciplinary Topics and Applications: Real-Time Systems
Humans and AI: Computer-Aided Education
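
The channel attention mentioned in the abstract builds on a squeeze-and-excitation module. As a rough illustration of that general idea only, the NumPy sketch below re-weights the channels of a single feature map. It is not the authors' implementation: the function name `squeeze_excite`, the weight matrices `w_reduce` and `w_expand`, and the reduction ratio implied by their shapes are all assumptions made for this example, and the context modeling step described in the abstract is omitted.

```python
import numpy as np


def squeeze_excite(features, w_reduce, w_expand):
    """Generic squeeze-and-excitation channel re-weighting (illustrative sketch).

    features : array of shape (C, H, W), one feature map
    w_reduce : array of shape (C // r, C), hypothetical bottleneck weights
    w_expand : array of shape (C, C // r), hypothetical expansion weights
    Returns the re-weighted features and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))                # shape (C,)

    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = np.maximum(w_reduce @ squeezed, 0.0)        # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))   # shape (C,), values in (0, 1)

    # Scale: broadcast each channel's gate over its spatial positions.
    return features * gates[:, None, None], gates


# Usage with random inputs: 8 channels, assumed reduction ratio r = 4.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 4, 4))
reweighted, channel_gates = squeeze_excite(
    feature_map,
    rng.normal(size=(2, 8)),
    rng.normal(size=(8, 2)),
)
```

Each gate lies strictly between 0 and 1, so channels with larger gates pass through closer to unchanged while the rest are attenuated. In a trained detector the two weight matrices would be learned together with the rest of the network.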