Model Rake: A Defense Against Stealing Attacks in Split Learning
Qinbo Zhang, Xiao Yan, Yanfeng Zhao, Fangcheng Fu, Quanqing Xu, Yukai Ding, Xiaokai Zhou, Chuang Hu, Jiawei Jiang
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 7002-7010.
https://doi.org/10.24963/ijcai.2025/779
Split learning is a prominent framework for vertical federated learning, where multiple clients collaborate with a central server for model training by exchanging intermediate embeddings. Recently, it has been shown that an adversarial server can exploit the intermediate embeddings to train surrogate models that replace the bottom models on the clients (i.e., model stealing). The surrogate models can also be used to reconstruct private training data of the clients (i.e., data stealing).
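To make the embedding exchange concrete, below is a minimal single-process sketch of one split-learning training step. It is illustrative only: the layer sizes, optimizer, and the assumption that the server holds the labels are ours, not the paper's setup.

import torch
import torch.nn as nn

# Toy split of a model: the client keeps the bottom model and its
# private features; the server keeps the top model (labels assumed
# server-side here for simplicity).
bottom = nn.Sequential(nn.Linear(32, 16), nn.ReLU())   # client side
top = nn.Linear(16, 10)                                # server side
opt = torch.optim.SGD(
    list(bottom.parameters()) + list(top.parameters()), lr=0.1)

x = torch.randn(8, 32)              # private client features (batch of 8)
y = torch.randint(0, 10, (8,))      # labels, assumed held by the server

emb = bottom(x)                     # intermediate embedding leaves the client
loss = nn.functional.cross_entropy(top(emb), y)
opt.zero_grad()
loss.backward()                     # in deployment, the server returns the
opt.step()                          # gradient w.r.t. emb over the network

The embeddings `emb` are exactly what an adversarial server observes across training, which is the signal the stealing attacks exploit.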
To defend against these stealing attacks, we propose Model Rake (Rake for short), which runs two bottom models on each client and differentiates their output spaces to make the two models distinct. Rake hinders the stealing attacks because it is difficult for a surrogate model to approximate two distinct bottom models. We prove that, under some assumptions, the surrogate model converges to the average of the two bottom models and is therefore inaccurate. Extensive experiments show that Rake is much more effective than existing methods in defending against both model and data stealing attacks, while leaving the accuracy of normal model training unaffected.
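The averaging result can be restated informally as follows, assuming the attacker fits a single surrogate g to both bottom models f_1 and f_2 under squared loss (the notation here is ours, not the paper's). Minimizing pointwise in x,

\[
g^\star(x) \;=\; \arg\min_{v}\; \|v - f_1(x)\|^2 + \|v - f_2(x)\|^2
\;=\; \frac{f_1(x) + f_2(x)}{2},
\]

since setting the gradient 2(v - f_1(x)) + 2(v - f_2(x)) to zero yields the midpoint. Hence, when the output spaces of f_1 and f_2 are well separated, g^\star lies far from both bottom models, so neither the stolen model nor data reconstructions built on it are accurate.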
Keywords:
Machine Learning: ML: Federated learning
Machine Learning: ML: Trustworthy machine learning
