Generating Natural Counterfactual Visual Explanations

Wenqi Zhao, Satoshi Oyama, Masahito Kurihara

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Doctoral Consortium. Pages 5204-5205. https://doi.org/10.24963/ijcai.2020/742

Counterfactual explanations help users understand the behavior of machine learning models by showing how changes to an input would change the output. For an image classification task, a counterfactual visual explanation answers the question: "for an example that belongs to class A, what changes must we make to the input so that the output is more inclined toward class B?" Our research modifies the attribute description text of class A on the basis of the attributes of class B and generates counterfactual images from the modified text. The model's predictions on these counterfactual images reveal which attributes have the greatest effect when the model distinguishes between classes A and B. We applied our method to a fine-grained image classification dataset and used a generative adversarial network to generate natural counterfactual visual explanations. To evaluate these explanations, we used them to assist crowdsourcing workers in an image classification task and found that, within a specific range, they improved classification accuracy.
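
A minimal sketch of the attribute-swap procedure the abstract describes, in Python. Here generate_image (a text-to-image GAN), classify (the model under inspection), the attribute dictionaries, and rank_attributes are assumed placeholder interfaces for illustration, not the authors' implementation:

from typing import Any, Callable, Dict, List, Tuple

def rank_attributes(
    attrs_a: Dict[str, str],                      # attribute -> description for class A
    attrs_b: Dict[str, str],                      # attribute -> description for class B
    generate_image: Callable[[str], Any],         # text-to-image GAN (assumed interface)
    classify: Callable[[Any], Dict[str, float]],  # model: image -> class probabilities
    class_b: str,
) -> List[Tuple[str, float]]:
    """Swap one attribute description of class A at a time for class B's,
    generate a counterfactual image from the edited text, and measure how
    much the model's probability of class B rises."""
    base_text = " ".join(attrs_a.values())
    base_prob = classify(generate_image(base_text))[class_b]

    effects = []
    for name in attrs_a.keys() & attrs_b.keys():  # attributes both classes describe
        edited = dict(attrs_a, **{name: attrs_b[name]})  # replace one description
        counterfactual = generate_image(" ".join(edited.values()))
        effects.append((name, classify(counterfactual)[class_b] - base_prob))

    # The largest probability shifts mark the attributes that matter most
    # to the model when it separates classes A and B.
    return sorted(effects, key=lambda kv: kv[1], reverse=True)

Ranking attributes by the shift they induce in the class-B probability is one straightforward reading of "the attributes that have the greatest effect"; the paper may measure this differently.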
Keywords:
Machine Learning: Interpretability
Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation