RRG-Mamba: Efficient Radiology Report Generation with State Space Model

Xiaodi Hou; Xiaobo Li; Mingyu Lu; Simiao Wang; Yijia Zhang

doi:10.24963/ijcai.2025/824

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence

Main Track. Pages 7410-7418. https://doi.org/10.24963/ijcai.2025/824

Recent advancements in radiology report generation have utilized deep neural networks such as CNNs and Transformers, achieving notable improvements in generating accurate and detailed reports. However, their practical adoption is hindered by the challenge of balancing global dependency modeling with computational efficiency. The state space model, particularly its enhanced variant Mamba, offers promising linear-complexity solutions for long-range dependency modeling. Despite its strengths, Mamba’s fixed positional encoding limits its ability to effectively capture complex spatial dependencies. To address this gap, we propose RRG-Mamba, an advanced framework for efficient radiology report generation. Within RRG-Mamba, we enhance the vanilla Mamba by integrating rotary position encoding (RoPE), enabling dynamic modeling of relative positional information in visual feature sequences. Furthermore, we design a global dependency learning module to optimize long-range visual feature sequence modeling. Extensive experiments on publicly available datasets, including IU X-Ray and MIMIC-CXR, demonstrate that RRG-Mamba achieves a 3.7% improvement in BLEU-4 score over existing models, along with significant gains in computational and memory efficiency. Our code is available at https://github.com/Eleanorhxd/RRG-Mamba.
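
The abstract does not say how RoPE is wired into the Mamba blocks, so the following is only a generic sketch of rotary position encoding itself, not the authors' implementation. It assumes a feature sequence of shape (seq_len, dim) with an even dim, uses the common "rotate-half" pairing of channels, and the function name `rope` and the `base` parameter are hypothetical choices made for illustration.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position encoding to x of shape (seq_len, dim).

    Illustrative sketch only: channel i is paired with channel i + dim/2,
    and each pair is rotated by an angle proportional to the position.
    """
    seq_len, dim = x.shape
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # One rotation frequency per channel pair, decaying geometrically.
    inv_freq = base ** (-np.arange(half) / half)        # (half,)
    # Angle for every (position, pair) combination.
    angles = np.outer(np.arange(seq_len), inv_freq)     # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation applied to each (x1, x2) pair.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=1)
```

Because each pair is rotated by an angle that grows linearly with position, the dot product between an encoded vector at position m and another at position n depends only on the offset n - m, which is the sense in which RoPE encodes relative rather than absolute position.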

Keywords:

Multidisciplinary Topics and Applications: MTA: Health and medicine

Natural Language Processing: NLP: Language generation