Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers

Binxiao Huang, Ngai Wong

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 1134-1142. https://doi.org/10.24963/ijcai.2025/127

Poisoning-based backdoor attacks expose vulnerabilities in the data preparation phase of deep neural network (DNN) training. A DNN trained on a poisoned dataset is embedded with a backdoor, so that it behaves well on clean data but outputs malicious predictions whenever a trigger is applied. To exploit the abundant information contained in the input-to-label mapping, our scheme uses a network trained on the clean dataset as a trigger generator, producing poisons that significantly raise the attack success rate over conventional approaches. Specifically, we introduce a new categorization of triggers inspired by adversarial techniques and propose a multi-label, multi-payload Poisoning-based backdoor attack with Positive Triggers (PPT), which strategically perturbs inputs to move them closer to the target label in the feature space of benign classifiers. Once a classifier is trained on the poisoned dataset, we can generate an input-label-aware trigger that makes the infected classifier predict any given input as any target label with high probability. Through extensive experiments under both dirty-label and clean-label settings, we show empirically that the proposed attack achieves a high attack success rate without sacrificing clean accuracy across various datasets, including SVHN, CIFAR10, GTSRB, and Tiny ImageNet. Moreover, PPT evades a variety of classical backdoor defenses, further demonstrating its effectiveness.
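
To make the mechanism concrete, below is a minimal sketch, not the paper's actual implementation, of how a benign classifier could act as a trigger generator: a targeted, PGD-style perturbation nudges an input toward the chosen target label in the clean model's decision space. The function name positive_trigger and the parameters epsilon, alpha, and steps are illustrative assumptions, not quantities taken from the paper.

import torch
import torch.nn.functional as F

def positive_trigger(clean_model, x, target_label, epsilon=8/255, alpha=2/255, steps=10):
    # Craft a "positive" trigger: a bounded perturbation that moves x toward
    # target_label in the decision space of a benignly trained model.
    # Hypothetical PGD-style sketch; the paper's generator may differ.
    clean_model.eval()
    delta = torch.zeros_like(x, requires_grad=True)
    target = torch.full((x.size(0),), target_label, dtype=torch.long, device=x.device)
    for _ in range(steps):
        loss = F.cross_entropy(clean_model(x + delta), target)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()           # targeted step: descend toward target
            delta.clamp_(-epsilon, epsilon)              # stay within the L-inf budget
            delta.copy_((x + delta).clamp(0, 1) - x)     # keep the poisoned image valid
        delta.grad.zero_()
    return (x + delta).detach()  # poisoned sample carrying an input-label-aware trigger

Under these assumptions, a dirty-label poisoner would relabel the perturbed samples to the target class, while a clean-label poisoner would apply the trigger only to samples already belonging to it.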
Keywords:
Computer Vision: CV: Other
Computer Vision: CV: Low-level Vision