Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Authors: Xiaolin Hu, Kai Li, Weiyi Zhang, Yi Luo, Jean-Marie Lemercier, Timo Gerkmann

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments showed that this asynchronous updating scheme achieved significantly better results with far fewer parameters than the traditional synchronous updating scheme. In addition, the proposed model achieved a good balance between speech separation accuracy and computational efficiency compared to other state-of-the-art models on three benchmark datasets.
Researcher Affiliation | Academia | 1 Department of Computer Science and Technology, Tsinghua Laboratory of Brain and Intelligence (THBI), IDG/McGovern Institute for Brain Research, Tsinghua University, Beijing, China; 2 Department of Electrical Engineering, Columbia University, NY, USA; 3 Department of Informatics, University of Hamburg, Hamburg, Germany
Pseudocode | No | The paper describes the model architecture and updating schemes using diagrams (Figure 1, Figure 3), but it does not provide any pseudocode or algorithm blocks.
Open Source Code | Yes | The PyTorch implementation of the models is publicly available. It is based on the code of SuDoRM-RF. This project is MIT licensed. (Footnote links to https://cslikai.cn/project/AFRCNN)
Open Datasets | Yes | Libri2Mix [2]: this dataset was constructed using the train-100, train-360, dev, and test sets of the LibriSpeech dataset [25]. ... WSJ0-2Mix [7]: this dataset contains a 30-hour training set, a 10-hour validation set, and a 5-hour test set. ... WHAM! [37]: WHAM! added noise to WSJ0-2Mix.
Dataset Splits | Yes | WSJ0-2Mix [7]: this dataset contains a 30-hour training set, a 10-hour validation set, and a 5-hour test set.
Hardware Specification | Yes | All experiments were conducted on a server with an Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz and GeForce RTX 1080 Ti 11G GPUs (×8).
Software Dependencies | No | The paper states that 'The PyTorch implementation of the models is publicly available', but it does not specify version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | We trained all models for 200 epochs on 3-second utterances for Libri2Mix and 4-second utterances for WHAM! and WSJ0-2Mix, with an 8 kHz sampling rate. Batch size was set to 8. The initial learning rate of the Adam optimizer was 1 × 10⁻³, and it decayed to 1/3 of the previous rate every 40 epochs. During training, gradient clipping with a maximum l2-norm of 5 was used.
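
For a concrete picture of the quoted hyperparameters, the following is a minimal PyTorch-style training-loop sketch that encodes only the settings listed in the Experiment Setup row. The model, data, and loss here are hypothetical placeholders (the actual A-FRCNN architecture and training objective are defined in the authors' repository), so this illustrates the quoted configuration rather than the authors' training script.

import torch
from torch import nn, optim
from torch.nn.utils import clip_grad_norm_

# Placeholder model standing in for A-FRCNN (the real architecture lives in the authors' repository).
model = nn.Conv1d(1, 2, kernel_size=3, padding=1)  # maps a 1-channel mixture to 2 estimated sources

# Settings quoted in the experiment setup above.
optimizer = optim.Adam(model.parameters(), lr=1e-3)                          # initial learning rate 1e-3
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=1 / 3)  # decay to 1/3 every 40 epochs

for epoch in range(200):                               # 200 training epochs
    for _ in range(10):                                # stand-in for the real Libri2Mix / WSJ0-2Mix data loader
        mixture = torch.randn(8, 1, 3 * 8000)          # batch size 8, 3-second utterances at 8 kHz
        sources = torch.randn(8, 2, 3 * 8000)          # hypothetical reference sources
        estimate = model(mixture)
        loss = ((estimate - sources) ** 2).mean()      # placeholder loss; the actual objective is not quoted here
        optimizer.zero_grad()
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=5.0)  # gradient clipping with maximum l2-norm of 5
        optimizer.step()
    scheduler.step()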