Counterfactual Fairness: Unidentification, Bound and Algorithm

Yongkai Wu, Lu Zhang, Xintao Wu

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 1438-1444. https://doi.org/10.24963/ijcai.2019/199

Fairness-aware learning studies the problem of building machine learning models that are subject to fairness requirements. Counterfactual fairness is a notion of fairness derived from Pearl's causal model, which considers a model fair if, for a particular individual or group, its prediction in the real world is the same as its prediction in the counterfactual world where the individual(s) had belonged to a different demographic group. However, an inherent limitation of counterfactual fairness is that it cannot be uniquely quantified from observational data in certain situations, due to the unidentifiability of the counterfactual quantity. In this paper, we address this limitation by mathematically bounding the unidentifiable counterfactual quantity, and we develop a theoretically sound algorithm for constructing counterfactually fair classifiers. We evaluate our method in experiments on both synthetic and real-world datasets and compare it with existing methods. The results validate our theory and demonstrate the effectiveness of our method.
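
For context, counterfactual fairness is commonly formalized (following Kusner et al., 2017) by requiring the prediction to be invariant under a counterfactual change of the sensitive attribute; the sketch below uses the standard notation (predictor \hat{Y}, sensitive attribute A, features X, background variables U) rather than quoting this paper's own definitions:

P(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a) = P(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a), for all y and all a'.

The left-hand side is the prediction distribution in the factual world and the right-hand side is its counterfactual counterpart; because the counterfactual term depends on unobserved U, it is in general not identifiable from observational data, which is the quantity this paper bounds.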
Keywords:
Humans and AI: Ethical Issues in AI
Machine Learning: Classification
Knowledge Representation and Reasoning: Action, Change and Causality