BESA: BERT-based Simulated Annealing for Adversarial Text Attacks

Xinghao Yang, Weifeng Liu, Dacheng Tao, Wei Liu

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 3293-3299. https://doi.org/10.24963/ijcai.2021/453

Modern Natural Language Processing (NLP) models are known to be immensely brittle against adversarial text examples. Recent attack algorithms usually adopt word-level substitution strategies that follow a pre-computed word replacement mechanism. However, the resulting adversarial examples still fall short in grammatical correctness and semantic similarity, largely because of unsuitable candidate word selection and static optimization methods. In this research, we propose BESA, a BERT-based Simulated Annealing algorithm, to address these two problems. First, we leverage the BERT Masked Language Model (MLM) to generate context-aware candidate words, producing fluent adversarial text while avoiding grammar errors. Second, we employ Simulated Annealing (SA) to adaptively determine the word substitution order. SA explores sufficient word replacement options via internal simulations, with the objective of achieving both a high attack success rate and a low word substitution rate. In addition, our algorithm is able to jump out of local optima with a controlled probability, bringing it closer to the best possible attack (i.e., the global optimum). Experiments on five popular datasets demonstrate the superiority of BESA over existing methods, including TextFooler, BAE, BERT-Attack, PWWS, and PSO.
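The sketch below is not the authors' implementation; it only illustrates, under stated assumptions, the two ingredients the abstract describes: context-aware candidate generation with the BERT MLM (via the Hugging Face fill-mask pipeline) and a Metropolis-style simulated-annealing acceptance rule that occasionally keeps worse moves to escape local optima. The scoring function `attack_score` is a hypothetical placeholder; BESA's actual objective combines attack success and word substitution rate against a victim model.

```python
# Minimal sketch (not the paper's code) of BERT-MLM candidates + SA acceptance.
import math
import random

from transformers import pipeline  # pip install transformers

unmasker = pipeline("fill-mask", model="bert-base-uncased")


def mlm_candidates(tokens, position, top_k=10):
    """Return context-aware replacement candidates for tokens[position]."""
    masked = tokens.copy()
    masked[position] = unmasker.tokenizer.mask_token
    predictions = unmasker(" ".join(masked), top_k=top_k)
    return [p["token_str"] for p in predictions]


def attack_score(tokens):
    """Hypothetical objective: higher = more damaging adversarial text.
    In practice this would query the victim model's loss/confidence."""
    return random.random()  # placeholder only


def sa_accept(current, proposed, temperature):
    """Metropolis rule: always keep improvements; otherwise keep a worse
    move with probability exp(delta / T), allowing escapes from local optima."""
    if proposed >= current:
        return True
    return random.random() < math.exp((proposed - current) / temperature)


# Toy usage: propose single-word substitutions under a cooling schedule.
tokens = "the movie was surprisingly good".split()
temperature, cooling = 1.0, 0.9
best = attack_score(tokens)
for step in range(5):
    pos = random.randrange(len(tokens))
    for cand in mlm_candidates(tokens, pos, top_k=5):
        trial = tokens[:pos] + [cand] + tokens[pos + 1:]
        score = attack_score(trial)
        if sa_accept(best, score, temperature):
            tokens, best = trial, score
            break
    temperature *= cooling  # lower temperature makes later steps greedier
```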
Keywords:
Machine Learning: Adversarial Machine Learning
Natural Language Processing: Natural Language Processing