A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering

Thomas Eiter, Tobias Geibinger, Nelson Higuera, Johannes Oetsch

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 3668-3676. https://doi.org/10.24963/ijcai.2023/408

Visual Question Answering (VQA) is a well-known problem for which deep learning is key. This makes answers to questions hard to explain, all the more so when advanced notions such as contrastive explanations (CEs) are to be provided. CEs explain why an answer was reached in contrast to a different one, and they are attractive because they focus on the reasons necessary to flip a query answer. We present a CE framework for VQA that uses a neurosymbolic VQA architecture which disentangles perception from reasoning. Once the reasoning part is provided as a logical theory, we use answer-set programming, in which CE generation can be framed as an abduction problem. We validate our approach on the CLEVR dataset, which we extend with more sophisticated questions to further demonstrate the robustness of the modular architecture. While we achieve top performance compared to related approaches, we can also produce CEs for explanation, model debugging, and validation tasks, showing the versatility of the declarative approach to reasoning.
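
To make the idea of CE generation as abduction concrete, the following is a minimal sketch in Python and not the authors' implementation. The paper uses answer-set programming; this sketch swaps in a plain brute-force search. It assumes a toy symbolic scene (as a perception module might produce), a toy reasoning step that counts red cubes, and it treats a contrastive explanation as a smallest set of attribute changes under which the answer becomes the desired alternative (the foil). All names (SCENE, DOMAINS, answer, contrastive_explanations) are hypothetical.

```python
from itertools import combinations, product

# Hypothetical symbolic scene, standing in for the output of a perception module.
SCENE = {
    "obj1": {"shape": "cube", "color": "red"},
    "obj2": {"shape": "sphere", "color": "blue"},
}

# Hypothetical attribute domains: the values an abduced change may assign.
DOMAINS = {
    "shape": ["cube", "sphere", "cylinder"],
    "color": ["red", "blue", "green"],
}


def answer(scene):
    """Toy reasoning step: answer the question 'How many red cubes are there?'."""
    return sum(
        1 for o in scene.values()
        if o["shape"] == "cube" and o["color"] == "red"
    )


def contrastive_explanations(scene, foil):
    """Return all smallest sets of attribute changes that make the answer equal foil.

    Candidates are enumerated by increasing number of changes, so the first
    size at which any candidate works is the minimum (cardinality-minimal).
    """
    slots = [(obj, attr) for obj in sorted(scene) for attr in sorted(DOMAINS)]
    for k in range(len(slots) + 1):
        found = []
        for chosen in combinations(slots, k):
            # For each chosen slot, try every value that differs from the current one.
            alternatives = [
                [v for v in DOMAINS[attr] if v != scene[obj][attr]]
                for obj, attr in chosen
            ]
            for values in product(*alternatives):
                # Work on a copy so the original scene stays untouched.
                changed = {obj: dict(attrs) for obj, attrs in scene.items()}
                for (obj, attr), value in zip(chosen, values):
                    changed[obj][attr] = value
                if answer(changed) == foil:
                    found.append(tuple(zip(chosen, values)))
        if found:
            return found
    return []


if __name__ == "__main__":
    print("answer:", answer(SCENE))
    for explanation in contrastive_explanations(SCENE, foil=2):
        print("why not 2?", explanation)
```

In this toy setting, asking why the answer is not 2 yields the two changes that would turn obj2 into a red cube. The exhaustive search grows exponentially with the number of changes and is only meant to illustrate the abductive reading; a declarative solver would be the natural choice at realistic scale.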
Keywords:
Machine Learning: ML: Neuro-symbolic methods
Knowledge Representation and Reasoning: KRR: Logic programming
Machine Learning: ML: Explainable/Interpretable machine learning