Generating Robust Audio Adversarial Examples with Temporal Dependency
Authors: Hongting Zhang, Pan Zhou, Qiben Yan, Xiao-Yang Liu
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results and a user study both suggest that the generated adversarial examples can significantly reduce human-perceptible noise and resist defenses based on temporal structure. We implement a successful attack on the latest model of an end-to-end CNN-based ASR system, Wav2letter+, with a differentiable Mel Frequency Cepstral Coefficient (MFCC) feature extraction. Experimental results show that the adversarial examples are effective even under the temporal dependency based defense (TD defense). The user study shows that our adversarial examples have the highest audio quality so far. (A hedged sketch of a differentiable MFCC front end appears after this table.) |
| Researcher Affiliation | Academia | Hongting Zhang (1), Qiben Yan (2), Pan Zhou (1) and Xiao-Yang Liu (3); (1) Huazhong University of Science and Technology, (2) Michigan State University, (3) Columbia University; htzhang@hust.edu.cn, qyan@msu.edu, panzhou@hust.edu.cn, xl2427@columbia.edu |
| Pseudocode | Yes | The pseudocode of the proposed algorithm is presented in Algorithm 1. |
| Open Source Code | No | The paper does not provide any statement about releasing the source code for its described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | LibriSpeech [Panayotov et al., 2015] is a corpus of approximately 1,000 hours of 16 kHz English speech derived from audiobooks from the LibriVox project. |
| Dataset Splits | Yes | It comes with its own training and validation sets, as well as test-clean and test-other sets. We use all available samples to train and validate our ASR system. (A hedged sketch of loading these splits is given after this table.) |
| Hardware Specification | Yes | All experiments are carried out on an Ubuntu Server (16.04 LTS) with an Intel Core i5-6500 @ 3.20 GHz × 4, 16 GB memory, and a GTX 1080 GPU. |
| Software Dependencies | No | The paper mentions that "we implement Wav2letter+ in Pytorch as our adversarial model" but does not specify version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | In our experiments, we set the learning rate as 1e-5 in the first stage and 5e-5 in the second stage. To strike a balance between epochs and distortion, we set the width B to 0.2 in all the following experiments to generate adversarial examples with high quality while reducing runtime cost. (A hedged sketch of the two-stage learning-rate schedule follows this table.) |
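
The research-type row notes that the attack runs end to end through a differentiable MFCC feature extraction into Wav2letter+. As a rough illustration only, the sketch below builds a differentiable MFCC front end with torchaudio; the 16 kHz sample rate matches LibriSpeech, while the number of coefficients and the dummy waveform are assumptions rather than values taken from the paper.

```python
import torch
import torchaudio

# Differentiable MFCC front end; 16 kHz matches LibriSpeech audio.
# n_mfcc=40 is an assumed value, not one reported by the paper.
mfcc = torchaudio.transforms.MFCC(sample_rate=16000, n_mfcc=40)

# Dummy 1-second waveform standing in for a benign audio sample.
waveform = torch.randn(1, 16000, requires_grad=True)

features = mfcc(waveform)  # shape: (1, n_mfcc, n_frames)

# Gradients flow from the features back to the raw waveform, which is what
# lets an adversarial perturbation be optimized through the whole pipeline.
features.sum().backward()
print(waveform.grad.shape)  # torch.Size([1, 16000])
```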
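
For the dataset rows, the sketch below pulls the LibriSpeech splits named above through torchaudio.datasets.LIBRISPEECH. The local root path is hypothetical, and only the 100-hour clean training subset is loaded to keep the example small; the paper reports using all available training samples.

```python
import os
import torchaudio

root = "./data"  # hypothetical download location
os.makedirs(root, exist_ok=True)

# One training subset plus the validation and test splits mentioned in the paper.
train = torchaudio.datasets.LIBRISPEECH(root, url="train-clean-100", download=True)
dev = torchaudio.datasets.LIBRISPEECH(root, url="dev-clean", download=True)
test_clean = torchaudio.datasets.LIBRISPEECH(root, url="test-clean", download=True)
test_other = torchaudio.datasets.LIBRISPEECH(root, url="test-other", download=True)

# Each item is (waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id).
waveform, sample_rate, transcript, *_ = test_clean[0]
print(sample_rate, transcript)
```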
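
For the experiment-setup row, the sketch below only illustrates the reported two-stage learning-rate schedule (1e-5, then 5e-5) applied to a perturbation variable. The Adam optimizer, the placeholder loss, and the stage lengths are assumptions for illustration and are not the paper's actual attack objective.

```python
import torch

# Perturbation over a 1-second, 16 kHz waveform (sizes are assumptions).
delta = torch.zeros(16000, requires_grad=True)
optimizer = torch.optim.Adam([delta], lr=1e-5)  # stage-1 learning rate from the paper

def attack_loss(perturbation):
    # Placeholder objective; the real attack would combine a targeted ASR loss
    # (e.g. CTC on the target transcript) with a distortion penalty.
    return perturbation.pow(2).mean()

STAGE1_STEPS, STAGE2_STEPS = 1000, 1000  # assumed stage lengths
for step in range(STAGE1_STEPS + STAGE2_STEPS):
    if step == STAGE1_STEPS:  # switch to the stage-2 learning rate
        for group in optimizer.param_groups:
            group["lr"] = 5e-5
    optimizer.zero_grad()
    loss = attack_loss(delta)
    loss.backward()
    optimizer.step()
```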