SAP: Privacy-Preserving Fine-Tuning on Language Models with Split-and-Privatize Framework
Xicong Shen, Yang Liu, Yi Liu, Peiran Wang, Huiqi Liu, Jue Hong, Bing Duan, Zirui Huang, Yunlong Mao, Ye Wu, Sheng Zhong
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 502-510.
https://doi.org/10.24963/ijcai.2025/57
Pre-trained Language Models (PLMs) have enabled a cost-effective approach to handling various downstream applications via Parameter-Efficient Fine-Tuning (PEFT) techniques. In this context, service providers have introduced a popular fine-tuning-based product known as Model-as-a-Service (MaaS), a one-stop platform that gives users access to extensive PLMs and training resources, allowing them to fine-tune, deploy, and use customized models on their private datasets. However, this service paradigm has recently been shown to risk leaking users' private data. In this work, we identify the data privacy leakage risks in MaaS-based PEFT and propose a Split-and-Privatize (SAP) framework, which mitigates privacy leakage by integrating split learning and differential privacy into MaaS PEFT. Furthermore, we propose Contributing-Token-Identification (CTI), a novel method to balance model utility degradation against privacy leakage. Comprehensive evaluation shows that the proposed framework achieves a 65% improvement in empirical privacy with only a 1% degradation in model performance on the Stanford Sentiment Treebank dataset, outperforming existing state-of-the-art baselines.
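To illustrate the split-and-privatize idea described in the abstract, the sketch below shows a user-side "bottom" model that computes token embeddings locally and perturbs them with noise before they leave the device, so the server only ever fine-tunes on privatized representations. This is a minimal, hypothetical illustration: the class and parameter names are assumptions, the Laplace perturbation stands in for the paper's actual privatization mechanism, and the real noise calibration and CTI-based token weighting differ.

```python
# Hedged sketch of the split-and-privatize pattern (not the authors' implementation).
# Assumptions: the PLM is split at the embedding layer; `epsilon` is a
# privacy-budget-style knob where larger epsilon means less noise.
import torch
import torch.nn as nn


class UserSideEncoder(nn.Module):
    """Bottom slice of the PLM kept on the user side (split learning)."""

    def __init__(self, embedding: nn.Embedding, epsilon: float = 1.0):
        super().__init__()
        self.embedding = embedding  # frozen pre-trained token embeddings
        self.epsilon = epsilon

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        # Local, private computation on the raw text tokens.
        x = self.embedding(input_ids)
        # Privatize before transmission: noise scale shrinks as the privacy
        # budget epsilon grows (a simplified, Laplace-style perturbation).
        noise = torch.distributions.Laplace(0.0, 1.0 / self.epsilon).sample(x.shape)
        return x + noise.to(x.device)  # only this perturbed tensor leaves the device


# Usage: the server receives only the privatized embeddings and performs
# PEFT on the top layers, never observing the user's raw tokens.
vocab_size, hidden_dim = 30522, 768
encoder = UserSideEncoder(nn.Embedding(vocab_size, hidden_dim), epsilon=2.0)
private_embeddings = encoder(torch.randint(0, vocab_size, (1, 16)))
```

In this setup, the privacy/utility trade-off the abstract mentions is governed by the noise scale: stronger perturbation improves empirical privacy but degrades downstream accuracy, which is the tension CTI is designed to balance by treating tokens according to their contribution to the task.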
Keywords:
AI Ethics, Trust, Fairness: ETF: Trustworthy AI
Machine Learning: ML: Federated learning
Machine Learning: ML: Trustworthy machine learning
Multidisciplinary Topics and Applications: MTA: Security and privacy
