D4AM: A General Denoising Framework for Downstream Acoustic Models
Authors: Chi-Chang Lee, Yu Tsao, Hsin-Min Wang, Chu-Song Chen
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that D4AM can consistently and effectively provide improvements to various unseen acoustic models and outperforms other combination setups. |
| Researcher Affiliation | Academia | 1National Taiwan University, Taipei, Taiwan 2Academia Sinica, Taipei, Taiwan |
| Pseudocode | Yes | Algorithm 1 D4AM (A General Denoising Framework for Downstream Acoustic Models) |
| Open Source Code | Yes | Our code is available at https://github.com/ChangLee0903/D4AM. |
| Open Datasets | Yes | The training datasets used in this study include: noise signals from DNS-Challenge (Reddy et al., 2020) and speech utterances from LibriSpeech (Panayotov et al., 2015). |
| Dataset Splits | Yes | For CHiME-4, we evaluated the performance on the development and test sets of the 1st track. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions software like the 'SpeechBrain toolkit', 'DEMUCS', and the 'Adam optimizer', but it does not specify any version numbers for these software components or libraries. |
| Experiment Setup | Yes | For pre-training, we only used L_reg to train the SE unit. At this stage, we selected Libri-360 and Libri-500 as the clean speech corpus. The SE unit was trained for 500,000 steps using the Adam optimizer with β1 = 0.9 and β2 = 0.999, learning rate 0.0002, gradient clipping value 1, and batch size 8. For fine-tuning, we used both L_reg and L_cls^ϕ to re-train the SE model initialized by the checkpoint selected from the pre-training stage. ... The SE unit was trained for 100,000 steps with the Adam optimizer with β1 = 0.9 and β2 = 0.999, learning rate 0.0001, gradient clipping value 1, and batch size 16. |
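The quoted fine-tuning recipe combines the signal-level regression objective L_reg with the downstream classification objective L_cls^ϕ under the optimizer settings listed in the table. The following is a minimal sketch of what one such fine-tuning step could look like in PyTorch, not the authors' implementation: `se_model`, `acoustic_model`, `regression_loss`, and the fixed weight `alpha` are illustrative placeholders, and D4AM itself derives the weighting coefficient automatically rather than fixing it.

```python
# Hedged sketch of a D4AM-style fine-tuning step (assumed names, not the authors' code).
# Assumptions: `se_model` is the pre-trained speech-enhancement unit, `acoustic_model`
# is a downstream recognizer exposing a recognition loss, and `regression_loss`
# compares enhanced speech against the clean reference.
import torch

def fine_tune_step(se_model, acoustic_model, regression_loss, optimizer,
                   noisy, clean, transcript, alpha=1.0):
    """One fine-tuning step: L_reg plus a weighted L_cls (fixed alpha is illustrative)."""
    enhanced = se_model(noisy)
    l_reg = regression_loss(enhanced, clean)           # signal-level regression loss
    l_cls = acoustic_model.loss(enhanced, transcript)  # recognition loss from downstream AM
    loss = l_reg + alpha * l_cls                       # D4AM adjusts this weighting automatically

    optimizer.zero_grad()
    loss.backward()
    # Gradient clipping value 1, as quoted in the table.
    torch.nn.utils.clip_grad_norm_(se_model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()

# Optimizer settings quoted for the fine-tuning stage; `se_model` is assumed to be
# initialized from the pre-training checkpoint.
# optimizer = torch.optim.Adam(se_model.parameters(), lr=1e-4, betas=(0.9, 0.999))
```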