Probabilistic Sufficient Explanations

Eric Wang, Pasha Khosravi, Guy Van den Broeck

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 3082-3088. https://doi.org/10.24963/ijcai.2021/424

Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient explanations, which formulate explaining an instance of classification as choosing the "simplest" subset of features such that observing only those features is "sufficient" to explain the classification; that is, sufficient to give strong probabilistic guarantees that the model will behave similarly when all features are observed under the data distribution. In addition, we leverage tractable probabilistic reasoning tools such as probabilistic circuits and expected predictions to design a scalable algorithm for finding the desired explanations while keeping the guarantees intact. Our experiments demonstrate the effectiveness of our algorithm in finding sufficient explanations and showcase its advantages compared to Anchors and logical explanations.
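
To make the idea of a "sufficient" feature subset concrete, the following is a minimal brute-force sketch on a toy problem. It is not the paper's algorithm, which relies on probabilistic circuits and expected predictions for scalability. Everything here is an assumption made for illustration: the toy classifier `classify`, the independent feature marginals in `MARGINALS`, the tolerance `delta`, and the reading of "sufficient" as "the classifier's decision stays the same with probability at least 1 - delta when the unobserved features are drawn from the data distribution".

```python
from itertools import combinations, product

# Assumed toy data distribution: four independent binary features,
# MARGINALS[i] = P(X_i = 1).
MARGINALS = [0.5, 0.5, 0.5, 0.5]


def classify(x):
    """Hypothetical classifier: majority vote over the first three features."""
    return 1 if x[0] + x[1] + x[2] >= 2 else 0


def same_decision_probability(x, subset):
    """Probability that the decision on x is unchanged when only the
    features in `subset` are fixed to their values in x and the rest
    are drawn from the (assumed independent) data distribution."""
    target = classify(x)
    free = [i for i in range(len(x)) if i not in subset]
    total = 0.0
    # Enumerate every completion of the unobserved features.
    for values in product([0, 1], repeat=len(free)):
        z = list(x)
        weight = 1.0
        for i, v in zip(free, values):
            z[i] = v
            weight *= MARGINALS[i] if v == 1 else 1.0 - MARGINALS[i]
        if classify(z) == target:
            total += weight
    return total


def smallest_sufficient_subset(x, delta):
    """Return (subset, probability) for a smallest feature subset whose
    same-decision probability is at least 1 - delta; ties at that size
    go to the subset with the highest probability."""
    for size in range(len(x) + 1):
        best = None
        for subset in combinations(range(len(x)), size):
            p = same_decision_probability(x, subset)
            if p >= 1.0 - delta and (best is None or p > best[1]):
                best = (subset, p)
        if best is not None:
            return best
    return None  # unreachable: the full feature set always has probability 1


explanation, prob = smallest_sufficient_subset((1, 1, 1, 0), delta=0.1)
print(explanation, prob)  # prints: (0, 1) 1.0
```

In this toy case, observing features 0 and 1 alone already fixes the majority vote, so they form a sufficient subset of size two, while no single feature reaches the 0.9 threshold. The enumeration is exponential in the number of features, which is the cost a tractable probabilistic model is meant to avoid.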
Keywords:
Machine Learning: Explainable/Interpretable Machine Learning
AI Ethics, Trust, Fairness: Explainability
Uncertainty in AI: Exact Probabilistic Inference