Libra-CAM: An Activation-Based Attribution Based on the Linear Approximation of Deep Neural Nets and Threshold Calibration

Sangkyun Lee; Sungmin Han

doi:10.24963/ijcai.2022/442

Libra-CAM: An Activation-Based Attribution Based on the Linear Approximation of Deep Neural Nets and Threshold Calibration

Sangkyun Lee, Sungmin Han

Watch video

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence

Main Track. Pages 3185-3191. https://doi.org/10.24963/ijcai.2022/442

PDF BibTeX

Universal application of AI has increased the need to explain why an AI model makes a specific decision in a human-understandable form. Among many related works, the class activation map (CAM)-based methods have been successful recently, creating input attribution based on the weighted sum of activation maps in convolutional neural networks. However, existing methods use channel-wise importance weights with specific architectural assumptions, relying on arbitrarily chosen attribution threshold values in their quality assessment: we think these can degrade the quality of attribution. In this paper, we propose Libra-CAM, a new CAM-style attribution method based on the best linear approximation of the layer (as a function) between the penultimate activation and the target-class score output. From the approximation, we derive the base formula of Libra-CAM, which is applied with multiple reference activations from a pre-built library. We construct Libra-CAM by averaging these base attribution maps, taking a threshold calibration procedure to optimize its attribution quality. Our experiments show that Libra-CAM can be computed in a reasonable time and is superior to the existing attribution methods in quantitative and qualitative attribution quality evaluations.

Keywords:

Machine Learning: Explainable/Interpretable Machine Learning

Computer Vision: Interpretability and Transparency

AI Ethics, Trust, Fairness: Explainability and Interpretability

AI Ethics, Trust, Fairness: Trustworthy AI