Rating AI Models for Robustness Through a Causal Lens
Kausik Lakkaraju
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Doctoral Consortium. Pages 10975-10976.
https://doi.org/10.24963/ijcai.2025/1242
AI models are increasingly accessible through chatbots and other applications, but their black-box nature and sensitivity to small changes in the input make them hard to interpret and trust. Existing correlation-based robustness metrics neither explain model errors nor isolate causal effects. To address this, I propose ARC (AI Rating through Causality), a causally grounded framework for rating AI models based on their robustness. ARC evaluates robustness by quantifying statistical and confounding biases, as well as the impact of perturbations on model performance, across diverse tasks. ARC produces interpretable raw scores and ratings that help developers and users make informed decisions about model robustness. Two future directions are (1) deriving raw scores for composite models from their component scores, and (2) combining ratings with traditional explainable AI approaches to provide a more holistic view of model behavior.
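To make the idea of measuring the impact of perturbations on model performance concrete, here is a minimal Python sketch. It is not ARC's scoring procedure, which this abstract does not specify. The names `accuracy`, `perturbation_impact`, `toy_model`, and `uppercase`, as well as the toy data, are hypothetical and chosen only for illustration. The sketch compares a model's accuracy on original inputs with its accuracy on perturbed copies of the same inputs.

```python
from typing import Callable, Sequence


def accuracy(model: Callable[[str], int],
             inputs: Sequence[str],
             labels: Sequence[int]) -> float:
    """Fraction of inputs for which the model's prediction matches the label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(labels)


def perturbation_impact(model: Callable[[str], int],
                        inputs: Sequence[str],
                        labels: Sequence[int],
                        perturb: Callable[[str], str]) -> float:
    """Accuracy on the original inputs minus accuracy on the perturbed inputs.

    Assumption: a simple accuracy drop stands in for whatever raw score
    a real rating framework would compute.
    """
    base = accuracy(model, inputs, labels)
    perturbed = accuracy(model, [perturb(x) for x in inputs], labels)
    return base - perturbed


def toy_model(text: str) -> int:
    """Hypothetical brittle classifier: case-sensitive keyword match."""
    return 1 if "good" in text else 0


def uppercase(text: str) -> str:
    """Hypothetical perturbation that should not change the meaning."""
    return text.upper()


inputs = ["good movie", "bad movie", "really good", "not great"]
labels = [1, 0, 1, 0]

impact = perturbation_impact(toy_model, inputs, labels, uppercase)
print(f"accuracy drop under perturbation: {impact}")
```

A larger drop indicates a less robust model under this particular perturbation. A causal analysis of the kind the abstract describes would additionally need to separate this effect from statistical and confounding biases, which this sketch does not attempt.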
Keywords:
AI Ethics, Trust, Fairness: ETF: Trustworthy AI
Machine Learning: ML: Causality
