Zero-Shot Audio Source Separation through Query-Based Learning from Weakly-Labeled Data
Authors: Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
AAAI 2022, pp. 4441-4449
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the separation performance, we test our model on MUSDB18 while training on the disjoint AudioSet. We further verify the zero-shot performance by conducting another experiment on audio source types that are held out from training. The model achieves comparable Source-to-Distortion Ratio (SDR) performance to current supervised models in both cases. (A minimal sketch of the SDR metric appears after the table.) |
| Researcher Affiliation | Collaboration | Ke Chen¹*, Xingjian Du²*, Bilei Zhu², Zejun Ma², Taylor Berg-Kirkpatrick¹, Shlomo Dubnov¹. ¹ University of California San Diego, CA, USA; ² ByteDance AI Lab, Shanghai, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The official code is available at https://git.io/JDWQ5 |
| Open Datasets | Yes | We choose AudioSet to train our sound event detection system ST-SED. It is a large-scale collection of over 2 million 10-sec audio samples, labeled with sound events from a set of 527 labels. (...) MUSDB18 (Rafii et al. 2017) (...) AudioSet (Gemmeke et al. 2017) (...) DESED test set (Serizel et al. 2020) |
| Dataset Splits | Yes | We train our audio separator on the AudioSet full train set, validate it on the AudioSet evaluation set, and evaluate it on the MUSDB18 test set, following the 6th community-based Signal Separation Evaluation Campaign (SiSEC 2018). (...) During the validation stage, we follow the same sampling paradigm to construct 5096 audio pairs from the AudioSet evaluation set and fix these pairs. |
| Hardware Specification | Yes | We implement the ST-SED in PyTorch, train it with a batch size of 128 and the AdamW optimizer (...) on 8 NVIDIA Tesla V100 GPUs in parallel. |
| Software Dependencies | No | The paper mentions "PyTorch" and the "AdamW optimizer" but does not provide version numbers for these software components. |
| Experiment Setup | Yes | We implement the ST-SED in PyTorch, train it with a batch size of 128 and the AdamW optimizer (β1=0.9, β2=0.999, eps=1e-8, decay=0.05) (...) We adopt a warmup schedule by setting the learning rate as 0.05, 0.1, 0.2 in the first three epochs; then the learning rate is halved every ten epochs until it returns to 0.05. (A minimal PyTorch sketch of this configuration appears after the table.) |
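
The paper reports separation quality as Source-to-Distortion Ratio, evaluated on MUSDB18 under the SiSEC 2018 protocol. The sketch below implements only the plain SDR definition (10·log10 of target energy over error energy), not the full BSS Eval / museval decomposition that SiSEC 2018 typically uses, so it is an illustration of the metric rather than the paper's exact evaluation pipeline.

```python
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray, eps: float = 1e-9) -> float:
    """Plain Source-to-Distortion Ratio in dB: target energy over error energy.

    Note: BSS Eval / museval (the SiSEC 2018 protocol) additionally allows a
    distortion-invariant projection of the estimate; this is the bare definition.
    """
    signal = np.sum(reference ** 2)
    error = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10((signal + eps) / (error + eps))

# Toy check: a noisier estimate of the same source scores a lower SDR.
rng = np.random.default_rng(0)
source = rng.standard_normal(44100)
print(sdr(source, source + 0.01 * rng.standard_normal(44100)))  # high SDR
print(sdr(source, source + 0.50 * rng.standard_normal(44100)))  # low SDR
```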
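
To make the quoted training setup concrete, here is a minimal PyTorch sketch of the AdamW configuration and the warmup-then-halving learning-rate schedule. The model is a hypothetical stand-in (the paper's ST-SED architecture is not reproduced), and the exact epoch at which each halving occurs is an assumption, since the quote does not pin the boundaries down.

```python
import torch

# Hypothetical stand-in module; the paper's ST-SED architecture is not reproduced here.
model = torch.nn.Linear(64, 527)

# AdamW with the hyperparameters quoted above (beta1=0.9, beta2=0.999, eps=1e-8, decay=0.05).
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1.0,  # base lr of 1.0, so the lambda below yields the absolute learning rate
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.05,
)

def lr_at_epoch(epoch: int) -> float:
    """Warmup (0.05, 0.1, 0.2 over the first three epochs), then halve every
    ten epochs until the rate returns to 0.05, per the quoted schedule.
    The halving boundaries are an assumption."""
    warmup = [0.05, 0.1, 0.2]
    if epoch < len(warmup):
        return warmup[epoch]
    halvings = (epoch - len(warmup)) // 10  # one halving per ten post-warmup epochs
    return max(0.2 * (0.5 ** halvings), 0.05)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_at_epoch)

# Training-loop skeleton: step the scheduler once per epoch.
for epoch in range(40):
    # ... one pass over AudioSet mini-batches (batch size 128) would go here ...
    optimizer.step()   # stands in for the per-batch parameter updates
    scheduler.step()
```

With a base learning rate of 1.0, LambdaLR's multiplicative factor doubles as the absolute rate, which keeps the schedule readable as the literal values from the quote.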