Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable
Authors: Shaojin Ding, Tianlong Chen, Zhangyang Wang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments on CNN-LSTM, RNN-Transducer, and Transformer models, and verified the existence of highly sparse winning tickets that can match the full model performance across those backbones. |
| Researcher Affiliation | Academia | Shaojin Ding¹*, Tianlong Chen²*, Zhangyang Wang²; ¹Texas A&M University, ²University of Texas at Austin |
| Pseudocode | Yes | Algorithm 1: Lottery Ticket Hypothesis Pruning. 1: Set the initial mask $m$, with the weight initialization $\theta$. 2: repeat 3: Rewind the weight to $\theta$. 4: Train $f(x; m \odot \theta)$ for $t$ epochs with algorithm $\mathcal{A}^{\mathcal{D}}_t$, i.e., $\mathcal{A}^{\mathcal{D}}_t(f(x; m \odot \theta))$. 5: Prune 20% of the remaining weights in $\mathcal{A}^{\mathcal{D}}_t(f(x; m \odot \theta))$ and update $m$ accordingly. 6: until the sparsity of $m$ reaches the desired sparsity level $s$. 7: Return $f(x; m \odot \theta)$. |
| Open Source Code | Yes | Codes are available at https://github.com/VITA-Group/Audio-Lottery. |
| Open Datasets | Yes | We conducted experiments on three commonly used ASR corpora: TED-LIUM (Rousseau et al., 2012), Common Voice (Ardila et al., 2020), and LibriSpeech (Panayotov et al., 2015). |
| Dataset Splits | No | The paper mentions test sets but does not provide specific percentages or counts for training, validation, and test splits, nor does it explicitly reference a standard split that defines these proportions. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments, only discussing computational complexity in terms of MACs. |
| Software Dependencies | No | The paper mentions using PyTorch and lists several GitHub repositories for implementation bases and pruning libraries, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | During training, we set the batch size to 32 and an initial learning rate to 0.0003, which is annealed down by a factor of 1.1 at the end of each epoch. All the models were trained for 16 epochs. |
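
The pruning loop quoted in the Pseudocode row and the learning-rate schedule quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the linked repository for that). The sketch assumes that pruning removes the smallest-magnitude surviving weights, which the quoted algorithm does not specify, and that "annealed down by a factor of 1.1" means dividing the learning rate by 1.1 after each epoch. The names `annealed_lr`, `prune_smallest`, `lottery_ticket_prune`, and `toy_train` are hypothetical, and `toy_train` is a stand-in for real model training.

```python
import random


def annealed_lr(epoch, initial_lr=3e-4, factor=1.1):
    """Learning rate for a 0-based epoch, assuming division by `factor` per epoch."""
    return initial_lr / (factor ** epoch)


def sparsity(mask):
    """Fraction of weights that the mask has removed."""
    return 1.0 - sum(mask) / len(mask)


def prune_smallest(weights, mask, fraction=0.2):
    """Return a new mask with `fraction` of the surviving weights removed.

    Assumption: the surviving weights with the smallest magnitude are pruned.
    """
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * fraction)
    new_mask = list(mask)
    for i in sorted(alive, key=lambda i: abs(weights[i]))[:n_prune]:
        new_mask[i] = 0
    return new_mask


def lottery_ticket_prune(theta, train, target_sparsity, epochs=16):
    """Iteratively train, prune 20% of remaining weights, and rewind to theta."""
    mask = [1] * len(theta)                              # step 1: initial mask
    while sparsity(mask) < target_sparsity:              # steps 2 and 6
        rewound = [m * w for m, w in zip(mask, theta)]   # step 3: rewind to theta
        trained = train(rewound, mask, epochs)           # step 4: train masked model
        new_mask = prune_smallest(trained, mask)         # step 5: prune 20%
        if new_mask == mask:                             # nothing left to prune
            break
        mask = new_mask
    return mask, [m * w for m, w in zip(mask, theta)]    # step 7: masked init


def toy_train(weights, mask, epochs):
    """Hypothetical stand-in for training: perturbs weights, keeps pruned ones at 0."""
    rng = random.Random(0)
    for epoch in range(epochs):
        lr = annealed_lr(epoch)
        weights = [m * (w + lr * rng.uniform(-1.0, 1.0))
                   for m, w in zip(mask, weights)]
    return weights


if __name__ == "__main__":
    rng = random.Random(1)
    theta = [rng.gauss(0.0, 1.0) for _ in range(100)]
    mask, ticket = lottery_ticket_prune(theta, toy_train, target_sparsity=0.5)
    print("remaining weights:", sum(mask), "sparsity:", sparsity(mask))
```

Because each round removes 20% of the weights that remain, sparsity grows geometrically, so the loop can overshoot the requested level slightly before it stops.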