Fusion of Granular-Ball Visual Spatial Representations for Enhanced Facial Expression Recognition

Shuaiyu Liu, Qiyao Shen, Yunxi Wang, Yazhou Ren, Guoyin Wang

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 1594-1602. https://doi.org/10.24963/ijcai.2025/178

Facial Expression Recognition (FER) is a fundamental problem in computer vision. Despite recent advances, significant challenges remain. Current methods focus primarily on extracting visual representations and overlook other valuable information. To address this limitation, we propose a novel method called Component Separation and Granular-ball Space Bootstrap Fusion (CS-GBSBF), which uses granular balls to transform images into spatial graphs, thereby enhancing the spatial information embedded in the images. Our method separates the face into different components and uses the spatial information to bootstrap their fusion. More specifically, CS-GBSBF consists of three main networks: the Represent Extraction Network (REN), the Represent Separation Network (RSN), and the Represent Fusion Network (RFN). First, granular balls are used to represent expression images as graphs, which are fed into REN along with the images. Then, RSN separates the basic visual and spatial representations extracted by REN into a set of component visual and spatial representations. Next, RFN uses the spatial representations to bootstrap the integration of the component visual representations. A significant challenge in two-stream models is feature alignment, which we address with an Attention Guidance Module (AGM) in REN and a Bootstrap Alignment Loss (L_BA) in RFN. Experimental results on eight databases show that CS-GBSBF consistently achieves higher recognition accuracy than several state-of-the-art methods. The code is available at https://github.com/Lsy235/CS-GBSBF.
Keywords:
Computer Vision: CV: Biometrics, face, gesture and pose recognition
Computer Vision: CV: Recognition (object detection, categorization)
Computer Vision: CV: Representation learning