CABIN: Debiasing Vision-Language Models Using Backdoor Adjustments

Bo Pang, Tingrui Qiao, Caroline Walker, Chris Cunningham, Yun Sing Koh

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 484-492. https://doi.org/10.24963/ijcai.2025/55

Vision-language models (VLMs) have demonstrated strong zero-shot inference capabilities but may exhibit stereotypical biases toward certain demographic groups. Consequently, downstream tasks leveraging these models may yield unbalanced performance across different target social groups, potentially reinforcing harmful stereotypes. Mitigating such biases is critical for ensuring fairness in practical applications. Existing debiasing approaches typically rely on curated face-centric datasets for fine-tuning or retraining, risking overfitting and limiting generalisability. To address these issues, we propose a novel framework, CABIN (Causal Adjustment Based INtervention), which leverages a causal framework to identify sensitive attributes in images as confounding factors. Employing a learned mapper trained on general large-scale image-text pairs rather than face-centric datasets, CABIN uses text to adjust sensitive attributes in the image embedding, ensuring independence between these sensitive attributes and the image embedding. This independence enables a backdoor adjustment for unbiased inference without the drawbacks of additional fine-tuning or retraining on narrowly tailored datasets. Through comprehensive experiments and analyses, we demonstrate that CABIN effectively mitigates biases and improves fairness metrics while preserving the zero-shot strengths of VLMs. The code is available at: https://github.com/ipangbo/causal-debias
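For context, the backdoor adjustment referenced in the abstract is the standard causal-inference identity; the sketch below is a minimal illustration, assuming the sensitive attribute Z acts as a confounder of the image embedding X and the prediction Y (the paper's exact variable definitions and estimation procedure may differ):

P(Y \mid \mathrm{do}(X = x)) = \sum_{z} P(Y \mid X = x, Z = z)\, P(Z = z)

Intuitively, predictions are averaged over sensitive-attribute values according to their marginal distribution rather than their distribution conditioned on the image, which removes the confounding influence of the sensitive attribute on the inference.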
Keywords:
AI Ethics, Trust, Fairness: ETF: Bias
AI Ethics, Trust, Fairness: ETF: Fairness and diversity
Computer Vision: CV: Machine learning for vision
Computer Vision: CV: Multimodal learning