Enabling Fast and Universal Audio Adversarial Attack Using Generative Model

Authors: Yi Xie, Zhuohang Li, Cong Shi, Jian Liu, Yingying Chen, Bo Yuan

AAAI 2021, pp. 14129–14137 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on DNN-based audio systems show that our proposed FAPG can achieve a high success rate with up to 214× speedup over the existing audio adversarial attack methods.
Researcher Affiliation | Academia | Rutgers University; The University of Tennessee, Knoxville
Pseudocode | Yes | Algorithm 1: Training Procedure of FAPG; Algorithm 2: Training Procedure of UAPG
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described in this paper. The link https://kaldi-asr.org/models/m3 refers to a third-party model used, not the authors' own implementation.
Open Datasets | Yes | Google Speech Commands dataset (Warden 2018), speaker recognition model on the VCTK dataset (Christophe, Junichi, and Kirsten 2016), and environmental sound classification model on the UrbanSound8K dataset (Salamon, Jacoby, and Bello 2014).
Dataset Splits | Yes (see the split sketch below) | The dataset is split into training, validation, and testing sets with a ratio of 8:1:1.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer and Wave-U-Net, but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes (see the training sketch below) | A total of 10,000 training steps are conducted using the Adam optimizer with a batch size of 100. The initial learning rate is set to 1e-4 and gradually decayed to 1e-6. β is set to 0.1 for all datasets. τ is initially set to 0.1 and is reduced to 0.05 and 0.03 at steps 3,000 and 7,000 for command recognition and speaker recognition, and it stops decreasing at 0.05 for the sound classification model.
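The 8:1:1 split reported in the Dataset Splits row is straightforward to reproduce. The sketch below is only an assumption about tooling; the paper does not describe its splitting code, so the use of PyTorch's random_split, the helper name split_8_1_1, and the fixed seed are illustrative choices rather than the authors' implementation.

```python
# Hypothetical sketch of the 8:1:1 train/validation/test split noted in the
# Dataset Splits row. torch.utils.data.random_split and the fixed seed are
# assumptions; the paper does not specify how the split was performed.
import torch
from torch.utils.data import Dataset, random_split

def split_8_1_1(dataset: Dataset, seed: int = 0):
    """Return (train, val, test) subsets with an 8:1:1 ratio."""
    n = len(dataset)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    n_test = n - n_train - n_val  # remainder absorbs rounding error
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_val, n_test], generator=generator)
```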
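The hyperparameters quoted in the Experiment Setup row can also be encoded compactly. The sketch below captures only the quoted values (10,000 Adam steps, batch size 100, learning rate decayed from 1e-4 to 1e-6, β = 0.1, and the τ schedule); the exponential decay rule, the function names, and the omitted generator/loss are assumptions, since the paper only says the learning rate is "gradually decayed".

```python
# Illustrative encoding of the training hyperparameters quoted in the
# Experiment Setup row. The exponential learning-rate decay is an assumed
# interpretation of "gradually decayed"; beta and tau are the coefficients
# referenced in the paper's training objective, whose exact roles are
# defined there.
import torch

TOTAL_STEPS = 10_000
BATCH_SIZE = 100
BETA = 0.1          # set to 0.1 for all datasets (per the paper)
INITIAL_LR = 1e-4
FINAL_LR = 1e-6

def tau_at(step: int, task: str) -> float:
    """tau schedule as quoted: 0.1 -> 0.05 -> 0.03 for command and speaker
    recognition; for sound classification it stops decreasing at 0.05."""
    if task == "sound_classification":
        return 0.1 if step < 3_000 else 0.05
    if step < 3_000:
        return 0.1
    if step < 7_000:
        return 0.05
    return 0.03

def make_optimizer(model: torch.nn.Module):
    """Adam optimizer with an assumed exponential decay from 1e-4 to 1e-6."""
    optimizer = torch.optim.Adam(model.parameters(), lr=INITIAL_LR)
    gamma = (FINAL_LR / INITIAL_LR) ** (1.0 / TOTAL_STEPS)  # per-step decay factor
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)
    return optimizer, scheduler
```

In a training loop, scheduler.step() would be called once per step and tau_at(step, task) queried for the current threshold, so that the schedule reaches 1e-6 at step 10,000 as described.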