Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Understanding the Role of Self Attention for Efficient Speech Recognition
Authors: Kyuhong Shim, Jungwook Choi, Wonyong Sung
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate this idea, we implement the layer-wise attention map reuse on real GPU platforms and achieve up to 1.96 times speedup in inference and 33% savings in training time with noticeably improved ASR performance for the challenging benchmark on LibriSpeech dev/test-other dataset. |
| Researcher Affiliation | Academia | Kyuhong Shim1, Jungwook Choi2, Wonyong Sung1 Department of Electrical and Computer Engineering, Seoul National University1 Department of Electrical Engineering, Hanyang University2 |
| Pseudocode | No | The paper describes computational procedures and equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We also provide the source code for the experiments in supplemental materials. |
| Open Datasets | Yes | We train and evaluate the model on the LibriSpeech-960 (Panayotov et al., 2015) dataset. |
| Dataset Splits | Yes | We train and evaluate the model on the LibriSpeech-960 (Panayotov et al., 2015) dataset. ... Table 2: Word error rate (%) for different attention map reuse configurations. ... dev-clean dev-other test-clean test-other |
| Hardware Specification | Yes | Inference speed is evaluated on a single RTX-Titan (24GB) GPU and training cost is measured in GPU-hours on an A100 (40GB) GPU. |
| Software Dependencies | No | The paper mentions software components and frameworks like Conformer, CTC, SentencePiece, AdamW, MFA, SyncBN, SpecAugment, and SWA, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Please see Appendix A.1 and A.2 for the model configuration and training details. ... Table 4: Conformer-M implementation details. ... Table 5: Training details including optimizer, scheduler, augmentation and other hyper-parameters. |
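For readers unfamiliar with the technique the Research Type row quotes, layer-wise attention map reuse computes the (softmaxed) attention map once and applies it again in subsequent layers, skipping the query/key projections and the softmax there. The following is a minimal single-head NumPy sketch under our own naming, not the paper's released code:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(x, w_q, w_k, w_v, reuse_map=None):
    """One simplified single-head self-attention layer.

    If `reuse_map` is given, the query/key projections and the
    softmax are skipped and the cached attention map is applied
    to the freshly projected values (the reuse idea in the paper).
    """
    if reuse_map is None:
        q, k = x @ w_q, x @ w_k
        reuse_map = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return reuse_map @ (x @ w_v), reuse_map

rng = np.random.default_rng(0)
T, d = 6, 8                                  # sequence length, model dim
x = rng.standard_normal((T, d))
# Each layer keeps its own projection weights; only the map is shared.
w1 = [rng.standard_normal((d, d)) for _ in range(3)]
w2 = [rng.standard_normal((d, d)) for _ in range(3)]

# A reuse group of size 2: compute the map in the first layer,
# reuse it in the second, paying only the value projection there.
y1, attn = attention_layer(x, *w1)
y2, _ = attention_layer(y1, *w2, reuse_map=attn)
```

This toy version omits multiple heads, masking, and the Conformer's convolution and feed-forward modules; it only illustrates why reuse saves compute (the skipped Q/K matmuls and softmax).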