Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition
Authors: Yao Qin, Nicholas Carlini, Garrison Cottrell, Ian Goodfellow, Colin Raffel
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 7 (Evaluation); Section 7.1 (Datasets and Evaluation Metrics); Table 1 (sentence-level accuracy and WER for 1000 clean and imperceptibly perturbed examples, fed without over-the-air simulation into the Lingvo model); Figure 1 (results of the human study on imperceptibility). |
| Researcher Affiliation | Collaboration | ¹Department of CSE, University of California, San Diego, USA; ²Google Brain, USA. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are present in the main body of the paper. |
| Open Source Code | No | The project webpage is at http://cseweb.ucsd.edu/~yaq007/imperceptible-robust-adv.html (this is a general project page; the paper does not explicitly state that code is available). |
| Open Datasets | Yes | We use the LibriSpeech dataset (Panayotov et al., 2015) in our experiments, which is a corpus of 16KHz English speech derived from audiobooks and is used to train the Lingvo system (Shen et al., 2019). |
| Dataset Splits | No | We randomly select 1000 audio examples as source examples, and 1000 separate transcriptions from the test-clean dataset to be the targeted transcriptions. This describes the data selected for their evaluation, not explicit train/validation/test splits for model training or for developing the attack algorithm. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers were mentioned in the paper. |
| Experiment Setup | Yes | We initially set ϵ to a large value and then gradually reduce it during optimization, following Carlini & Wagner (2018). The parameter α that balances the network loss ℓ_net(f(x + δ), y) and the imperceptibility loss ℓ_θ(x, δ) is initialized with a small value, e.g., 0.05, and is adaptively updated according to the performance of the attack. A hedged sketch of this setup follows the table. |
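
The Experiment Setup row describes an optimization that combines a network loss with an imperceptibility loss weighted by an adaptively updated α, while an L∞ bound ϵ on the perturbation is gradually reduced. The minimal PyTorch sketch below illustrates how such a loop could look; it is not the authors' released code. `asr_loss`, `masking_loss`, and `succeeded` are hypothetical placeholders for the Lingvo network loss, the psychoacoustic masking loss ℓ_θ, and a check that the model already transcribes the target, and all hyperparameter values are illustrative. In the paper the two stages run sequentially (ϵ reduction first, then the masking loss with adaptive α); the sketch condenses them into one loop for brevity.

```python
import torch

def imperceptible_attack(x, target, asr_loss, masking_loss, succeeded,
                         eps=2000.0, alpha=0.05, lr=100.0,
                         steps=1000, check_every=20):
    """Condensed sketch: minimize l_net(f(x + delta), y) + alpha * l_theta(x, delta)
    under an L-infinity bound eps that is tightened whenever the attack succeeds."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for step in range(1, steps + 1):
        opt.zero_grad()
        net = asr_loss(x + delta, target)        # loss pushing the transcription toward the target
        percept = masking_loss(x, delta)         # imperceptibility (frequency-masking) loss
        (net + alpha * percept).backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)              # enforce ||delta||_inf <= eps
        if step % check_every == 0:
            if succeeded(x + delta, target):     # model currently outputs the target phrase
                eps *= 0.8                       # gradually reduce the distortion bound
                alpha *= 1.2                     # weight imperceptibility more heavily
            else:
                alpha = max(alpha * 0.8, 0.05)   # back off so the attack can recover (illustrative floor)
    return (x + delta).detach()
```

The adaptive-α rule mirrors the quoted description: α grows when the attack currently succeeds (trading attack margin for imperceptibility) and shrinks when it fails; the exact schedules and initial ϵ here are assumptions, not values taken from the paper.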