Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN

Yao Qiang, Chengyin Li, Marco Brocanelli, Dongxiao Zhu

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 732-739. https://doi.org/10.24963/ijcai.2022/103

Bias in the training data can jeopardize the fairness and explainability of deep neural network predictions on test data. We propose a novel bias-tailored data augmentation approach, Counterfactual Interpolation Augmentation (CIA), which attempts to debias the training data by d-separating the spurious correlation between the target variable and the sensitive attribute. CIA generates counterfactual interpolations along a path simulating the distribution transitions between the input and its counterfactual example. As a pre-processing approach, CIA enjoys two advantages: first, it couples with either plain training or debiasing training to markedly increase fairness over the sensitive attribute; second, it enhances the explainability of deep neural networks by generating attribution maps via integrating counterfactual gradients. We demonstrate the superior performance of CIA-trained deep neural network models using qualitative and quantitative experimental results. Our code is available at: https://github.com/qiangyao1988/CIA
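The two mechanisms named in the abstract, interpolating along the path from an input to its counterfactual and integrating gradients along that path to form attributions, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the linear interpolation, the `grad_fn` callback, and the toy quadratic model are all assumptions, and the attribution step mirrors Integrated Gradients with the counterfactual example playing the role of the baseline.

```python
import numpy as np

def counterfactual_interpolations(x, x_cf, num_steps=10):
    """Augmented samples along a linear path from x to its counterfactual x_cf.

    Assumes a simple linear transition; the paper's generator may follow a
    learned path instead.
    """
    alphas = np.linspace(0.0, 1.0, num_steps)
    return np.array([(1.0 - a) * x + a * x_cf for a in alphas])

def integrated_counterfactual_gradients(grad_fn, x, x_cf, num_steps=50):
    """Riemann-sum approximation of gradients integrated along the
    x_cf -> x path, scaled by the path difference (Integrated-Gradients
    style, with the counterfactual example as the baseline).

    grad_fn: callable returning the model's input gradient at a point
    (hypothetical interface, standing in for autodiff on a real DNN).
    """
    alphas = np.linspace(0.0, 1.0, num_steps)
    grads = np.array([grad_fn((1.0 - a) * x_cf + a * x) for a in alphas])
    return (x - x_cf) * grads.mean(axis=0)

# Toy example: f(x) = sum(x**2), so grad f(x) = 2x.
grad_fn = lambda z: 2.0 * z
x = np.array([1.0, 2.0])
x_cf = np.array([0.0, 0.0])
path = counterfactual_interpolations(x, x_cf)
attr = integrated_counterfactual_gradients(grad_fn, x, x_cf)
# With a zero counterfactual, the exact integral gives attr = x**2 = [1, 4].
```

The interpolated samples `path` would be added to the training set to break the spurious correlation, while `attr` serves as the attribution map for explainability.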
Keywords:
AI Ethics, Trust, Fairness: Fairness & Diversity
AI Ethics, Trust, Fairness: Explainability and Interpretability