InverseNet: Augmenting Model Extraction Attacks with Training Data Inversion

Xueluan Gong, Yanjiao Chen, Wenbin Yang, Guanghao Mei, Qian Wang

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 2439-2447. https://doi.org/10.24963/ijcai.2021/336

Cloud service providers, including Google, Amazon, and Alibaba, have launched machine-learning-as-a-service (MLaaS) platforms, allowing clients to access sophisticated cloud-based machine learning models via APIs. Unfortunately, the commercial value of these models makes them alluring targets for theft, and their strategic position within the IT infrastructure of many companies makes them an enticing springboard for further adversarial attacks. In this paper, we put forth a novel and effective attack strategy, dubbed InverseNet, that steals the functionality of black-box cloud-based models with only a small number of queries. The crux of the innovation is that, unlike existing model extraction attacks that rely on public datasets or adversarial samples, InverseNet constructs inversed training samples to increase the similarity between the extracted substitute model and the victim model. Further, only a small number of data samples with high confidence scores (rather than an entire dataset) are used to reconstruct the inversed dataset, which substantially reduces the attack cost. Extensive experiments conducted on three simulated victim models and Alibaba Cloud's commercially available API demonstrate that InverseNet yields a model with significantly greater functional similarity to the victim model than the current state-of-the-art attacks, at a substantially lower query budget.
Keywords:
Machine Learning: Adversarial Machine Learning
Machine Learning: Deep Learning
Multidisciplinary Topics and Applications: Security and Privacy
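
To make the pipeline described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of an inversion-augmented extraction loop. It is an illustration of the general idea only: the `victim` module and `query_victim` helper stand in for the black-box cloud API, and the gradient-ascent inversion against the substitute is one plausible instantiation of an inversion step, not the authors' actual InverseNet architecture.

```python
# Illustrative sketch of inversion-augmented model extraction.
# All names (victim, query_victim, invert) are hypothetical scaffolding,
# not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM, CLASSES, BUDGET, TOPK = 64, 10, 512, 64

# Stand-in for the black-box victim API; the attacker only sees its outputs.
victim = nn.Sequential(nn.Linear(DIM, 128), nn.ReLU(), nn.Linear(128, CLASSES))

def query_victim(x):
    with torch.no_grad():
        return F.softmax(victim(x), dim=1)  # confidence scores from the API

substitute = nn.Sequential(nn.Linear(DIM, 128), nn.ReLU(), nn.Linear(128, CLASSES))
opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)

# 1) Spend a small query budget on seed inputs and keep only the samples
#    the victim labels with high confidence.
seeds = torch.randn(BUDGET, DIM)
scores = query_victim(seeds)
conf, labels = scores.max(dim=1)
top = conf.topk(TOPK).indices
x_conf, y_conf = seeds[top], labels[top]

# 2) Inversion step (one plausible form): gradient ascent on the input so
#    that the white-box substitute assigns a given class with high score.
def invert(cls, steps=100, lr=0.1):
    x = torch.randn(1, DIM, requires_grad=True)
    for _ in range(steps):
        loss = -F.log_softmax(substitute(x), dim=1)[0, cls]
        loss.backward()
        with torch.no_grad():
            x -= lr * x.grad
            x.grad.zero_()
    return x.detach()

for epoch in range(20):
    # 3) Build the inversed set for the confidently observed classes and
    #    label it via the victim API (a few extra queries).
    x_inv = torch.cat([invert(c.item()) for c in y_conf.unique()])
    y_inv = query_victim(x_inv).argmax(dim=1)
    x_train = torch.cat([x_conf, x_inv])
    y_train = torch.cat([y_conf, y_inv])

    # 4) Fit the substitute on the confident + inversed samples.
    opt.zero_grad()
    F.cross_entropy(substitute(x_train), y_train).backward()
    opt.step()
```

Interleaving inversion with substitute training is the key design point this sketch tries to convey: as the substitute tracks the victim more closely, the inversed samples become more representative of the victim's training distribution, which in turn improves the substitute, all while the victim is queried far less than with a full public dataset.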