Understanding the Role of Self Attention for Efficient Speech Recognition

Authors: Kyuhong Shim, Jungwook Choi, Wonyong Sung

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To evaluate this idea, we implement the layer-wise attention map reuse on real GPU platforms and achieve up to 1.96 times speedup in inference and 33% savings in training time with noticeably improved ASR performance for the challenging benchmark on LibriSpeech dev/test-other dataset."
Researcher Affiliation | Academia | Kyuhong Shim¹, Jungwook Choi², Wonyong Sung¹; ¹Department of Electrical and Computer Engineering, Seoul National University; ²Department of Electrical Engineering, Hanyang University
Pseudocode | No | The paper describes computational procedures and equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "We also provide the source code for the experiments in supplemental materials."
Open Datasets | Yes | "We train and evaluate the model on the LibriSpeech-960 (Panayotov et al., 2015) dataset."
Dataset Splits | Yes | "We train and evaluate the model on the LibriSpeech-960 (Panayotov et al., 2015) dataset." ... "Table 2: Word error rate (%) for different attention map reuse configurations." ... dev-clean / dev-other / test-clean / test-other
Hardware Specification | Yes | "Inference speed is evaluated on a single RTX Titan (24GB) GPU and training cost is measured in GPU-hours on A100 (40GB) GPU."
Software Dependencies | No | The paper mentions software components and frameworks such as Conformer, CTC, SentencePiece, AdamW, MFA, SyncBN, SpecAugment, and SWA, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | "Please see Appendix A.1 and A.2 for the model configuration and training details." ... "Table 4: Conformer-M implementation details." ... "Table 5: Training details including optimizer, scheduler, augmentation and other hyper-parameters."
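The Research Type row quotes the paper's central idea: layer-wise attention map reuse, where the attention probabilities computed in one layer are shared by the layers that follow it so those layers can skip the query/key projections and the softmax. The sketch below is a minimal, hypothetical illustration of that mechanism in plain PyTorch; the class and argument names are ours, and the actual model in the paper is a Conformer with convolution modules and relative positional attention, which this sketch omits.

```python
# Minimal sketch of layer-wise attention map reuse (illustrative, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReuseSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, attn_probs=None):
        # x: (batch, time, d_model)
        B, T, _ = x.shape
        v = self.v_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        if attn_probs is None:
            # Source layer: compute the attention map once.
            q = self.q_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
            k = self.k_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
            scores = torch.matmul(q, k.transpose(-2, -1)) / self.d_head ** 0.5
            attn_probs = F.softmax(scores, dim=-1)
        # Reusing layers skip the q/k projections, score computation, and softmax.
        ctx = torch.matmul(attn_probs, v).transpose(1, 2).reshape(B, T, -1)
        return self.out_proj(ctx), attn_probs

if __name__ == "__main__":
    layers = nn.ModuleList(ReuseSelfAttention(256, 4) for _ in range(3))
    x = torch.randn(2, 100, 256)
    out, probs = layers[0](x)                  # source layer computes the map
    for layer in layers[1:]:
        out, _ = layer(out, attn_probs=probs)  # later layers reuse it
```

Skipping the score computation and softmax in the reusing layers is where the reported inference speedup and training-time savings would come from.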
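The Open Datasets and Dataset Splits rows point to LibriSpeech-960 and its standard dev/test splits. As a hedged, illustrative example (not the paper's data pipeline), torchaudio's built-in LIBRISPEECH dataset class can fetch those splits; the root path and helper name below are placeholders.

```python
# Illustrative only: fetching the LibriSpeech evaluation splits with torchaudio.
# The paper trains on the full 960 h set (train-clean-100/360 + train-other-500).
import torchaudio

EVAL_SPLITS = ["dev-clean", "dev-other", "test-clean", "test-other"]

def load_eval_splits(root: str = "./data"):
    # Each item yields (waveform, sample_rate, transcript, speaker, chapter, utterance).
    return {
        split: torchaudio.datasets.LIBRISPEECH(root, url=split, download=True)
        for split in EVAL_SPLITS
    }

if __name__ == "__main__":
    for name, ds in load_eval_splits().items():
        print(name, len(ds), "utterances")
```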
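The Hardware Specification row reports inference speed measured on a single GPU. A common way to obtain such numbers is CUDA-event timing with warm-up iterations; the snippet below is a generic sketch under that assumption, not the authors' benchmark script.

```python
# Generic GPU latency measurement sketch (assumed methodology, not from the paper).
import torch

@torch.no_grad()
def time_forward(model, batch, n_warmup: int = 10, n_iters: int = 50) -> float:
    model.eval().cuda()
    batch = batch.cuda()
    for _ in range(n_warmup):   # warm-up to stabilise clocks and kernel selection
        model(batch)
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(n_iters):
        model(batch)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / n_iters   # milliseconds per forward pass
```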