Enhancing Transferability of Audio Adversarial Example for Both Frequency- and Time-domain

Zilin Tian, Yunfei Long, Liguo Zhang, Jiahong Zhao

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 6263-6271. https://doi.org/10.24963/ijcai.2025/697

Audio adversarial examples add acoustically imperceptible perturbations to clean audio, fooling classification models into producing incorrect results. Transferability is a critical property of audio adversarial examples: it makes black-box attacks applicable in practice and has attracted increasing interest. Although recent studies achieve transferability across models within the same domain, they consistently fail to achieve transferability across different domains. Time-domain and frequency-domain models are the two predominant approaches in audio classification, and we observe that adversarial examples generated for one domain transfer poorly to the other. To address this limitation, we propose an Adaptive Inter-domain Ensemble (AIE) attack, which integrates transferable adversarial information from both domains and dynamically optimizes their contributions through adaptive weighting, improving the cross-domain transferability of audio adversarial examples. Extensive evaluations on diverse datasets consistently demonstrate that AIE outperforms existing methods in cross-domain adversarial transferability.
Keywords:
Machine Learning: ML: Adversarial machine learning
Natural Language Processing: NLP: Speech