SpeechHGT: A Multimodal Hypergraph Transformer for Speech-Based Early Alzheimer’s Disease Detection

Shagufta Abid, Dongyu Zhang, Ahsan Shehzad, Jing Ren, Shuo Yu, Hongfei Lin, Feng Xia

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
AI4Tech: AI Enabling Technologies. Pages 9131-9139. https://doi.org/10.24963/ijcai.2025/1015

Early detection of Alzheimer's disease (AD) through spontaneous speech analysis is a promising, non-invasive diagnostic approach. Existing methods predominantly rely on fusion-based multimodal deep learning, which integrates linguistic and acoustic features effectively but models the higher-order interactions between modalities inadequately, reducing diagnostic accuracy. To address this, we introduce SpeechHGT, a multimodal hypergraph transformer designed to capture and learn higher-order interactions among spontaneous speech features. SpeechHGT encodes multimodal features as hypergraphs, with nodes representing individual features and hyperedges representing grouped interactions. A novel hypergraph attention mechanism enables robust modeling of both pairwise and higher-order interactions. Experiments on the DementiaBank datasets show that SpeechHGT achieves state-of-the-art performance, surpassing baseline models in accuracy and F1 score. These results highlight the potential of hypergraph-based models to improve AI-driven diagnostic tools for early AD detection.
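The hypergraph formulation the abstract describes (nodes as individual features, hyperedges as grouped interactions, attention over both) can be sketched generically. The following is a minimal two-stage node-to-hyperedge-to-node attention pass, not the authors' actual SpeechHGT implementation; the projection matrices, scoring rule, and incidence matrix layout are all illustrative assumptions.

```python
import numpy as np

def hypergraph_attention(X, H, Wq, Wk):
    """One illustrative hypergraph-attention step (not the paper's exact layer).

    X:  (N, d) node embeddings, one node per extracted speech feature
    H:  (N, E) incidence matrix, H[i, e] = 1 if node i belongs to hyperedge e
        (every node is assumed to lie in at least one non-empty hyperedge)
    Wq, Wk: (d, d) hypothetical learned projections
    """
    N, d = X.shape
    E = H.shape[1]
    Q, K = X @ Wq, X @ Wk

    # Stage 1: each hyperedge attends over its member nodes, so a single
    # hyperedge can pool an arbitrary-sized group of interacting features.
    edge_feats = np.zeros((E, d))
    for e in range(E):
        members = H[:, e] > 0
        q = K[members].mean(axis=0)            # hyperedge query from member keys
        scores = Q[members] @ q / np.sqrt(d)   # scaled dot-product logits
        w = np.exp(scores - scores.max())
        w = w / w.sum()
        edge_feats[e] = w @ X[members]

    # Stage 2: each node attends over its incident hyperedges, mixing the
    # grouped (higher-order) interactions back into per-node representations.
    out = np.zeros_like(X)
    for i in range(N):
        inc = H[i] > 0
        scores = edge_feats[inc] @ Q[i] / np.sqrt(d)
        w = np.exp(scores - scores.max())
        w = w / w.sum()
        out[i] = w @ edge_feats[inc]
    return out

# Toy example: 6 feature nodes (e.g. mixed linguistic/acoustic features)
# grouped into 3 overlapping hyperedges.
rng = np.random.default_rng(0)
N, d = 6, 8
X = rng.normal(size=(N, d))
H = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 0],
              [0, 1, 1],
              [0, 0, 1],
              [1, 0, 1]], dtype=float)
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
out = hypergraph_attention(X, H, Wq, Wk)
```

Because a hyperedge connects any number of nodes at once, a single attention score in stage 1 weights a whole feature group, which is what lets this kind of layer model higher-order (beyond pairwise) interactions.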
Keywords:
Advanced AI4Tech: Multimodal AI4Tech
Advanced AI4Tech: Data-driven AI4Tech
Domain-specific AI4Tech: AI4Care and AI4Health