Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
Authors: Ning Zhang, Junchi Yan, Yuchen Zhou
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that the proposed method performs competitively on a public benchmark against state-of-the-art weakly supervised methods. We train our proposed model on the dev set and test the trained model on the test set. We report the overall performance (SDR) and the three individual metrics (SIR, SAR, ISR) for the bass, drums, and vocals separation task and the vocals vs. non-vocals separation task on the test part of DSD100. |
| Researcher Affiliation | Collaboration | 1 Shanghai Jiao Tong University, Shanghai, P.R. China 2 IBM Research China, Beijing, P.R. China |
| Pseudocode | Yes | Algorithm 1 Source Separation GAN (SSGAN): Energy preserved Wasserstein learning of audio source separation. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for their methodology, nor does it provide a link to such code. |
| Open Datasets | Yes | We term our method source separation GAN (SSGAN). ... in the Demixing Secrets Dataset (DSD100): http://liutkus.net/DSD100.zip |
| Dataset Splits | Yes | The DSD100 is divided into a dev set and a test set. Each of them consists of 50 songs. We train our proposed model on the dev set and test the trained model on the test set. |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instances). |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'BSS Eval toolbox' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The parameters for the generator and discriminator networks were initialized randomly. We used the Adam optimizer [Kingma and Ba, 2014] with hyperparameters α = 0.0001, β1 = 0.5, β2 = 0.9 to train the generator and the discriminator, using a batch size of 16. The other parameters related to the model setup can be found in Table 1. |
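The evaluation metrics quoted above (SDR, SIR, SAR, ISR) come from the BSS Eval toolbox, which the paper uses without stating a version. As a rough illustration of what SDR measures, the sketch below computes a simplified, scale-invariant-style SDR; it projects the estimate onto the reference and is *not* the full BSS Eval SDR, which additionally allows a short distortion filter.

```python
# Simplified scale-invariant SDR sketch (NOT the full BSS Eval SDR,
# which permits a time-invariant distortion filter on the target).
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """SDR in dB: energy of the projection of the estimate onto the
    reference, divided by the energy of the residual."""
    # Optimal scaling of the reference that best explains the estimate.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference          # part of the estimate explained by the source
    residual = estimate - target        # everything else (interference + artifacts)
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(residual ** 2))
```

A nearly clean estimate (reference plus small noise) should score high, e.g. well above 20 dB, while an unrelated signal scores near or below 0 dB.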
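The Experiment Setup row pins down the optimizer but not the framework, so the update rule itself is the most reproducible detail. Below is a minimal pure-Python sketch of one Adam step using exactly the reported hyperparameters (α = 0.0001, β1 = 0.5, β2 = 0.9); the scalar toy objective is a stand-in, not the paper's adversarial loss, and ε = 1e-8 is an assumed default the paper does not state.

```python
# One Adam update step with the paper's reported hyperparameters.
# The quadratic toy objective below is a placeholder, not the SSGAN loss.
import math

ALPHA, BETA1, BETA2 = 1e-4, 0.5, 0.9
EPS = 1e-8  # assumed default; not specified in the paper

def adam_step(theta, grad, m, v, t):
    """Single Adam update for a scalar parameter theta at step t >= 1."""
    m = BETA1 * m + (1 - BETA1) * grad        # first-moment estimate
    v = BETA2 * v + (1 - BETA2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)              # bias correction
    v_hat = v / (1 - BETA2 ** t)
    theta = theta - ALPHA * m_hat / (math.sqrt(v_hat) + EPS)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t)
```

With α = 1e-4 each step moves the parameter by at most roughly the learning rate, which is why GAN training at this setting typically runs for many thousands of iterations.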