Learning Graph-based Residual Aggregation Network for Group Activity Recognition

Wei Li, Tianzhao Yang, Xiao Wu, Zhaoquan Yuan

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 1102-1108. https://doi.org/10.24963/ijcai.2022/154

Group activity recognition aims to understand the overall behavior performed by a group of people. Recently, some graph-based methods have made progress by learning relation graphs among multiple persons. However, the differences between an individual and the others play an important role in distinguishing confusable group activities, and these differences have not been thoroughly explored by previous methods. In this paper, a novel end-to-end trainable Graph-based Residual AggregatIon Network (GRAIN) is proposed to model the differences among all persons in the group. Specifically, a new local residual relation module is proposed to capture the local spatiotemporal differences of relevant persons, and it is further combined with multi-graph relation networks. Moreover, a weighted aggregation strategy is devised to adaptively select multi-level spatiotemporal features, from appearance-level information to high-level relations. As a result, our model extracts a comprehensive representation and infers the group activity in an end-to-end manner. Experimental results on two popular benchmarks for group activity recognition demonstrate the superior performance of our method over state-of-the-art methods.
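
The two ideas named in the abstract, modeling differences between a person and the others and fusing multi-level features with adaptive weights, can be illustrated with a toy sketch. This is not the GRAIN architecture: the abstract does not specify its formulation, so everything below is an assumption made for illustration. In particular, the function names `residual_aggregate` and `weighted_fusion`, the dot-product similarity, the softmax weighting, and the scalar per-level logits are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residual_aggregate(features):
    """For each person i, return sum_j a_ij * (x_j - x_i) over j != i.

    a_ij is a softmax over dot-product similarity (an assumed choice).
    Requires at least two persons.
    """
    n = len(features)
    out = []
    for i, xi in enumerate(features):
        others = [features[j] for j in range(n) if j != i]
        weights = softmax([dot(xi, xj) for xj in others])
        diff = [
            sum(w * (xj[d] - xi[d]) for w, xj in zip(weights, others))
            for d in range(len(xi))
        ]
        out.append(diff)
    return out

def weighted_fusion(levels, logits):
    """Fuse same-shaped feature levels with softmax-normalised scalar weights.

    In a trained model the logits would be learned; here they are given.
    """
    weights = softmax(logits)
    n, dim = len(levels[0]), len(levels[0][0])
    return [
        [sum(w * lvl[i][d] for w, lvl in zip(weights, levels)) for d in range(dim)]
        for i in range(n)
    ]

# Toy per-person appearance features (3 persons, 2 dimensions).
persons = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
residuals = residual_aggregate(persons)
fused = weighted_fusion([persons, residuals], logits=[0.0, 0.0])
```

In this sketch, persons with identical features yield zero residuals, so the residual level only carries information when individuals differ, which is the intuition the abstract appeals to for separating confusable activities.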
Keywords:
Computer Vision: Action and Behaviour Recognition
Computer Vision: Video analysis and understanding