Enhancing Automated Grading in Science Education through LLM-Driven Causal Reasoning and Multimodal Analysis
Haohao Zhu, Tingting Li, Peng He, Jiayu Zhou
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Human-Centred AI. Pages 10352-10360.
https://doi.org/10.24963/ijcai.2025/1150
Automated assessment of open-ended responses in K–12 science education poses significant challenges due to the multimodal nature of student work, which often integrates textual explanations, drawings, and handwritten elements. Traditional evaluation methods that focus solely on textual analysis fail to capture the full breadth of student reasoning and are susceptible to biases such as handwriting neatness or answer length. In this paper, we propose a novel LLM-augmented multimodal evaluation framework that addresses these limitations through a comprehensive, bias-corrected grading system. Our approach leverages LLMs to generate causal knowledge graphs that encapsulate the essential conceptual relationships in student responses, comparing these graphs with reference graphs derived automatically from the grading rubrics. Experimental results demonstrate that our framework improves grading accuracy and consistency over deep supervised learning and few-shot LLM baselines.
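To make the graph-comparison idea concrete, the sketch below scores a student's causal knowledge graph against a rubric-derived reference graph using edge-overlap precision, recall, and F1. This is an illustrative simplification, not the paper's implementation: the edge representation, the exact-match comparison, and the F1 scoring are assumptions for demonstration; the actual framework may use richer graph alignment.

```python
def graph_f1(student_edges: set[tuple[str, str]],
             rubric_edges: set[tuple[str, str]]) -> float:
    """Score a student's causal graph against a rubric reference graph.

    Each graph is a set of directed (cause, effect) edges. We count
    exact edge matches and return the F1 of precision and recall.
    (Hypothetical scoring; the paper's alignment may be more elaborate.)
    """
    if not student_edges or not rubric_edges:
        return 0.0
    matched = student_edges & rubric_edges          # edges the student got right
    precision = len(matched) / len(student_edges)   # fraction of student edges that are correct
    recall = len(matched) / len(rubric_edges)       # fraction of rubric edges the student covered
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example: a student captures two of the three causal links in the rubric.
student = {("heating", "expansion"), ("expansion", "pressure increase")}
rubric = {("heating", "expansion"),
          ("expansion", "pressure increase"),
          ("pressure increase", "lid pops off")}
score = graph_f1(student, rubric)  # precision 1.0, recall 2/3 -> F1 = 0.8
```

In practice such a score would feed into the rubric-based grade alongside the multimodal (text and drawing) evidence; exact edge matching is the simplest design choice and could be relaxed with semantic similarity between node labels.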
Keywords:
IJCAI25: Human-Centred AI
