AudioQR: Deep Neural Audio Watermarks For QR Code

Xinghua Qu; Xiang Yin; Pengfei Wei; Lu Lu; Zejun Ma

doi:10.24963/ijcai.2023/687

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence

AI for Good. Pages 6192-6200. https://doi.org/10.24963/ijcai.2023/687

Image-based quick response (QR) codes are frequently used, but they create barriers for visually impaired people. With the goal of "AI for good", this paper proposes AudioQR, a barrier-free QR coding mechanism for the visually impaired population via deep neural audio watermarks. Previous audio watermarking approaches are mainly based on handcrafted pipelines, which are less secure and difficult to apply in large-scale scenarios. In contrast, AudioQR is the first comprehensive end-to-end pipeline that hides watermarks in audio imperceptibly and robustly. To achieve this, we jointly train an encoder and a decoder, where the encoder is structured as a concatenation of transposed convolutions and multi-receptive field fusion modules. Moreover, we customize the decoder training with a stochastic data augmentation chain to make the watermarked audio robust to different audio distortions, such as environmental background noise, the room impulse response when playing through the air, surrounding music, and Gaussian noise. Experimental results indicate that AudioQR can efficiently hide arbitrary information in audio without introducing a significant perceptible difference. Our code is available at https://github.com/xinghua-qu/AudioQR.
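
To make the idea of a stochastic data augmentation chain concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each distortion is applied independently with some probability to the watermarked audio before it reaches the decoder. The function names, the `apply_prob` parameter, the SNR values, and the synthetic background and impulse response are all hypothetical stand-ins; a real pipeline would likely draw on recorded noise, music, and measured room impulse responses.

```python
import numpy as np


def add_gaussian_noise(audio, rng, snr_db=20.0):
    """Add white Gaussian noise at an assumed signal-to-noise ratio (dB)."""
    signal_power = np.mean(audio ** 2) + 1e-12
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return audio + rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)


def mix_background(audio, rng, snr_db=15.0):
    """Mix in a stand-in background signal (a random tone, for illustration)."""
    t = np.arange(audio.shape[0])
    freq = rng.uniform(0.001, 0.05)          # cycles per sample (hypothetical)
    phase = rng.uniform(0.0, 2.0 * np.pi)
    background = np.sin(2.0 * np.pi * freq * t + phase)
    signal_power = np.mean(audio ** 2) + 1e-12
    bg_power = np.mean(background ** 2) + 1e-12
    gain = np.sqrt(signal_power / (bg_power * 10.0 ** (snr_db / 10.0)))
    return audio + gain * background


def apply_room_impulse_response(audio, rng, rir_len=512):
    """Convolve with a synthetic impulse response: decaying noise plus direct path."""
    decay = np.exp(-np.arange(rir_len) / (rir_len / 6.0))
    rir = rng.normal(size=rir_len) * decay
    rir[0] = 1.0                              # direct sound
    rir /= np.sqrt(np.sum(rir ** 2))          # normalize to unit energy
    return np.convolve(audio, rir, mode="full")[: audio.shape[0]]


# Order of the chain is an assumption made for this sketch.
DISTORTIONS = (add_gaussian_noise, mix_background, apply_room_impulse_response)


def stochastic_augmentation_chain(audio, rng, apply_prob=0.5):
    """Apply each distortion independently with probability `apply_prob`."""
    out = audio.astype(np.float64)            # astype returns a copy
    for distortion in DISTORTIONS:
        if rng.random() < apply_prob:
            out = distortion(out, rng)
    return out
```

In a training loop, the watermarked waveform would pass through `stochastic_augmentation_chain` before decoding, so the decoder sees a different random combination of distortions on every step.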

Keywords:

AI for Good: Multidisciplinary Topics and Applications

AI for Good: Machine Learning