Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Authors: Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David Cox, Jim Glass
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on low-resource ASR verify (1) sparse subnetworks exist in mono-lingual/multi-lingual pre-trained speech SSL, and (2) the computational advantage and performance gain of PARP over baseline pruning methods. |
| Researcher Affiliation | Collaboration | ¹MIT CSAIL, ²MIT-IBM Watson AI Lab, ³National Taiwan University, ⁴UC Santa Barbara |
| Pseudocode | Yes | Algorithm 1 Prune-Adjust-Re-Prune (PARP) to target sparsity s |
| Open Source Code | Yes | Project webpage: https://people.csail.mit.edu/clai24/parp/ |
| Open Datasets | Yes | wav2vec 2.0: We took wav2vec 2.0 base (wav2vec2-base) and large (wav2vec2-large) pre-trained on Librispeech 960 hours [6]. |
| Dataset Splits | Yes | Our experimental setup can be found in Appendix 9. For Librispeech, we use the splits provided by wav2vec 2.0 [6]: 10min, 1h, 10h for low-resource finetuning, and dev-other, dev-clean, test-other, test-clean for evaluation. |
| Hardware Specification | Yes | We thank IBM for the donation to MIT of the Satori GPU cluster, and John Cohn for maintaining the cluster. |
| Software Dependencies | No | The paper mentions using fairseq and wav2vec 2.0 (implicitly software), but does not provide specific version numbers for these or any other software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | All models are finetuned with Adam optimizer with learning rate 3e-5, weight decay 0.01, and 50k warm-up steps and then linearly decayed to 0. Batch size is 32. |
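
The learning-rate schedule quoted in the Experiment Setup row (linear warm-up, then linear decay to 0) can be illustrated with a minimal sketch. This is not the authors' code. The function name `lr_at_step`, the `config` dictionary, the warm-up starting from 0, and in particular `total_steps` are assumptions made for illustration; the quoted text does not state the total number of updates.

```python
# Hyperparameters quoted in the Experiment Setup row.
config = {
    "optimizer": "adam",
    "peak_lr": 3e-5,
    "weight_decay": 0.01,
    "warmup_steps": 50_000,
    "batch_size": 32,
    # ASSUMPTION: the total number of updates is not given in the quoted
    # text; 100_000 is a placeholder chosen only to make the sketch runnable.
    "total_steps": 100_000,
}


def lr_at_step(step, peak_lr=config["peak_lr"],
               warmup_steps=config["warmup_steps"],
               total_steps=config["total_steps"]):
    """Return the learning rate at a given update step.

    Linear warm-up from 0 to peak_lr over warmup_steps (starting at 0 is an
    assumption), then linear decay from peak_lr to 0 at total_steps.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step < warmup_steps:
        # Warm-up phase: the rate grows proportionally to the step index.
        return peak_lr * step / warmup_steps
    if step >= total_steps:
        # Schedule finished: the rate stays at 0.
        return 0.0
    # Decay phase: the rate shrinks linearly with the remaining steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 25_000, 50_000, 75_000, 100_000):
        print(s, lr_at_step(s))
```

With the placeholder `total_steps`, the rate peaks at step 50,000 and reaches 0 at step 100,000; a different total would change only the slope of the decay phase.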