Relation-Based Counterfactual Explanations for Bayesian Network Classifiers

Emanuele Albini, Antonio Rago, Pietro Baroni, Francesca Toni

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 451-457. https://doi.org/10.24963/ijcai.2020/63

We propose a general method for generating counterfactual explanations (CFXs) for a range of Bayesian Network Classifiers (BCs), e.g. single- or multi-label, binary or multidimensional. We focus on explanations built from relations of (critical and potential) influence between variables, indicating the reasons for classifications, rather than on any probabilistic information. We show by means of a theoretical analysis of CFXs’ properties that they serve the purpose of indicating (potentially) pivotal factors in the classification process, whose absence would give rise to different classifications. We then demonstrate empirically for various BCs that CFXs provide useful information in real-world settings, e.g. when race plays a part in parole violation prediction, and show that they have inherent advantages over existing explanation methods in the literature.
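To make the counterfactual idea concrete, the sketch below is a toy illustration, not the paper's CFX construction (which is built from influence relations between variables in a Bayesian network). It uses an assumed, hypothetical naive Bayes classifier with made-up probabilities and brute-forces the "pivotal" binary features: those whose flip alone would change the classification.

```python
# Illustrative sketch only: a toy binary naive Bayes classifier and a
# brute-force search for "pivotal" features, i.e. features whose value,
# if flipped, changes the predicted class. All probabilities below are
# invented for illustration; this is not the method of the paper.

# P(feature_i = 1 | class) for two classes, three binary features.
P_F_GIVEN_C = {
    0: [0.2, 0.7, 0.4],  # class 0
    1: [0.8, 0.3, 0.6],  # class 1
}
P_C = [0.5, 0.5]  # uniform class prior

def joint(x, c):
    """Joint probability P(x, c) under the naive Bayes model."""
    p = P_C[c]
    for i, xi in enumerate(x):
        pf = P_F_GIVEN_C[c][i]
        p *= pf if xi == 1 else 1 - pf
    return p

def classify(x):
    """Most probable class for the binary feature vector x."""
    return max((0, 1), key=lambda c: joint(x, c))

def pivotal_features(x):
    """Indices of features whose single flip changes the classification."""
    base = classify(x)
    pivotal = []
    for i in range(len(x)):
        y = list(x)
        y[i] = 1 - y[i]  # counterfactual: flip one feature
        if classify(y) != base:
            pivotal.append(i)
    return pivotal
```

For instance, with these numbers `classify((1, 0, 1))` yields class 1, and only flipping feature 0 overturns that prediction, so feature 0 is the single pivotal factor for this input.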
Keywords:
AI Ethics: Explainability
Uncertainty in AI: Bayesian Networks