Learning to Separate Voices by Spatial Regions

Authors: Alan Xu, Romit Roy Choudhury

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Results show promising performance, underscoring the importance of personalization over a generic supervised approach." |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Illinois, US. |
| Pseudocode | Yes | "Algorithm 1 presents the pseudo code; we explain the key steps below." |
| Open Source Code | No | The paper mentions "audio samples available at our project website" (https://uiuc-earable-computing.github.io/binaural/), but this link is for audio samples, not for the source code of the methodology. There is no explicit statement about releasing the source code. |
| Open Datasets | Yes | "For supervised region-based separation, we use the CIPIC HRTF database (Algazi et al., 2001). ... We use the LibriMix dataset (Cosentino et al., 2020), sampled at 16 kHz..." |
| Dataset Splits | Yes | "With the script used in (Dovrat et al., 2021), Libri5Mix is used for training and validation, while Libri2Mix, Libri3Mix, Libri4Mix, and Libri5Mix are used for testing." (Split mapping sketched below.) |
| Hardware Specification | Yes | "The model is trained on 4 1080ti GPUs using the ADAM optimizer with batch size 4." (Training setup sketched below.) |
| Software Dependencies | No | The paper mentions software components such as the "ADAM optimizer", "STFT", and "Hanning window", and refers to a "feature concatenation TasNet", but does not specify version numbers for any libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used. |
| Experiment Setup | Yes | "To configure the feature-concatenation TasNet, we set N = 512, L = 32, B = 128, Sc = 128, P = 3, X = 8, R = 3, following the convention in (Luo & Mesgarani, 2019). ... We set f_aliasing = 562 Hz, which is about the 36th bin in the FFT. We set α = 5, σ_th = 0.00007 seconds; this value was set empirically based on our discussion of Figure 4." (Configuration sketched below.) |