Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning

Authors: Ning Zhang, Junchi Yan, Yuchen Zhou

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that the proposed method performs competitively on public benchmark against state-of-the-art weakly supervised methods. We train our proposed model on the dev set and test the trained model on the test set. The overall performance (SDR) and the three individual metrics (SIR, SAR, ISR) for bass, drums, vocals separation task and the vocals vs. non-vocals separation task on the test part of DSD100."
Researcher Affiliation | Collaboration | 1 Shanghai Jiao Tong University, Shanghai, P.R. China; 2 IBM Research China, Beijing, P.R. China
Pseudocode | Yes | "Algorithm 1 Source Separation GAN (SSGAN): Energy preserved Wasserstein learning of audio source separation." (A hedged sketch of such a training loop appears after this table.)
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for its methodology, nor does it provide a link to such code.
Open Datasets | Yes | "We term our method as source separation GAN (SSGAN). ... in Demixing Secrets Dataset (DSD100)" (footnote: http://liutkus.net/DSD100.zip)
Dataset Splits | Yes | "The DSD100 is divided into a dev set and a test set. Each of them consists of 50 songs. We train our proposed model on the dev set and test the trained model on the test set."
Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instances).
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'BSS Eval toolbox' but does not provide specific version numbers for these or any other software dependencies. (A hedged BSS Eval usage example appears after this table.)
Experiment Setup | Yes | "The parameters for the generator and discriminator networks were initialized randomly. We used the Adam optimizer [Kingma and Ba, 2014] with hyperparameters α = 0.0001, β1 = 0.5, β2 = 0.9 to train the generator and the discriminator, using a batch size of 16. The other parameters related to the model setup can be found in Table 1."
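
The paper's Algorithm 1 is only named in the extracted text, so the following is a minimal, hypothetical sketch of what an energy-preserved Wasserstein (WGAN-style) training step for source separation could look like in PyTorch. The network architectures, the mask-based generator, the L1 energy-preservation penalty and its weight `lam`, and the weight-clipping constant `clip` are all assumptions; only the Adam hyperparameters (α = 0.0001, β1 = 0.5, β2 = 0.9) and the batch size of 16 are taken from the paper.

```python
# Hypothetical sketch only -- NOT the paper's Algorithm 1, which is not
# reproduced in the extracted text. Assumed: magnitude-spectrogram frames,
# a mask-based generator, a weight-clipped Wasserstein critic, and an L1
# "energy preservation" penalty tying the summed source estimates to the
# mixture. Network shapes, `lam`, and `clip` are invented for illustration.
import torch
import torch.nn as nn

N_BINS = 513        # assumed STFT frame size
N_SOURCES = 2       # e.g. the vocals vs. non-vocals task

generator = nn.Sequential(                  # mixture frame -> per-source masks
    nn.Linear(N_BINS, 1024), nn.ReLU(),
    nn.Linear(1024, N_BINS * N_SOURCES), nn.Sigmoid(),
)
critic = nn.Sequential(                     # Wasserstein critic on source frames
    nn.Linear(N_BINS, 512), nn.ReLU(),
    nn.Linear(512, 1),
)

# Adam settings quoted in the paper: alpha = 0.0001, beta1 = 0.5, beta2 = 0.9.
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.9))
opt_d = torch.optim.Adam(critic.parameters(), lr=1e-4, betas=(0.5, 0.9))


def training_step(mixture, clean_sources, lam=1.0, clip=0.01):
    """One alternating update. mixture: (B, N_BINS) mixture frames;
    clean_sources: (M, N_BINS) unpaired single-source frames (weak labels)."""
    # Critic update: minimize score(fake) - score(real).
    masks = generator(mixture).view(-1, N_SOURCES, N_BINS)
    fake = (masks * mixture.unsqueeze(1)).reshape(-1, N_BINS)
    d_loss = critic(fake.detach()).mean() - critic(clean_sources).mean()
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    for p in critic.parameters():           # weight clipping as in vanilla WGAN
        p.data.clamp_(-clip, clip)

    # Generator update: fool the critic while preserving the mixture energy.
    masks = generator(mixture).view(-1, N_SOURCES, N_BINS)
    estimates = masks * mixture.unsqueeze(1)
    adv = -critic(estimates.reshape(-1, N_BINS)).mean()
    energy = (estimates.sum(dim=1) - mixture).abs().mean()
    g_loss = adv + lam * energy
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


# Toy call with the batch size of 16 reported in the paper.
mix = torch.rand(16, N_BINS)
src = torch.rand(16 * N_SOURCES, N_BINS)
print(training_step(mix, src))
```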
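
The paper reports SDR, SIR, SAR, and ISR computed with the BSS Eval toolbox, but no toolbox version is given. Below is a hedged example of how such metrics can be computed with the mir_eval package, one common Python port of BSS Eval; the sampling rate, signal lengths, and random placeholder signals are assumptions, and ISR specifically requires the bss_eval_images variant rather than bss_eval_sources.

```python
# Hedged example of computing BSS Eval metrics with mir_eval (a common Python
# port of the BSS Eval toolbox the paper cites; the authors' toolbox version is
# unknown). Sampling rate, durations, and the random signals are placeholders.
import numpy as np
import mir_eval.separation

rate, seconds, n_sources = 16000, 3, 2
reference = np.random.randn(n_sources, rate * seconds)   # ground-truth sources
estimated = np.random.randn(n_sources, rate * seconds)   # separator outputs

# bss_eval_sources returns SDR, SIR, SAR (per source) plus the best permutation;
# ISR is only produced by the bss_eval_images variant on multichannel "images".
sdr, sir, sar, perm = mir_eval.separation.bss_eval_sources(reference, estimated)
print("SDR:", sdr, "SIR:", sir, "SAR:", sar, "perm:", perm)
```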