Continual Audio-Visual Sound Separation

Authors: Weiguo Pian, Yiyang Nan, Shijian Deng, Shentong Mo, Yunhui Guo, Yapeng Tian

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that ContAV-Sep can effectively mitigate catastrophic forgetting and achieve significantly better performance compared to other continual learning baselines for audio-visual sound separation. Code is available at: https://github.com/weiguoPian/ContAV-Sep_NeurIPS2024. ... In this section, we first introduce the setup of our experiments, i.e., dataset, baselines, evaluation metrics, and the implementation details.
Researcher Affiliation | Academia | 1 The University of Texas at Dallas, 2 Brown University, 3 Carnegie Mellon University
Pseudocode | No | The paper describes its methods in text and uses mathematical formulations but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at: https://github.com/weiguoPian/ContAV-Sep_NeurIPS2024.
Open Datasets | Yes | Following common practice [83, 88, 14], we conducted experiments on MUSIC-21 [83]... To further validate the efficacy of our method across a broader sound domain, we conduct experiments using the AVE [68] and the VGGSound [13] datasets in the appendix.
Dataset Splits | Yes | We randomly split them into training, validation, and testing sets with 840, 100, and 100 videos, respectively.
Hardware Specification | Yes | We train our proposed method and all baselines on an NVIDIA RTX A5000 GPU.
Software Dependencies | No | The paper mentions software like PyTorch [51], Detic [87], CLIP [56], and VideoMAE [69], but it does not provide specific version numbers for these key software components as required for reproducibility.
Experiment Setup | Yes | In our proposed Cross-modal Similarity Distillation Constraint (CrossSDC), the balance weights λins and λcls are set to 0.1 and 0.3, respectively, and the balance weight λdist for the output distillation loss is set to 0.3 in our experiments. For the memory set, we set the number of samples per old class to 1, as do the other baselines that involve a memory set.
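
For reference, a minimal sketch of how the reported balance weights could enter the overall training objective. Only the weight values (0.1, 0.3, 0.3) and the one-sample-per-old-class memory size come from the quoted setup; the loss-term names and the weighted-sum structure are assumptions, not the authors' verified implementation.

```python
import torch

# Assumed constants taken from the quoted experiment setup.
LAMBDA_INS = 0.1            # instance-level CrossSDC term
LAMBDA_CLS = 0.3            # class-level CrossSDC term
LAMBDA_DIST = 0.3           # output distillation term
MEMORY_PER_OLD_CLASS = 1    # exemplars kept per old class in the memory set

def total_loss(separation_loss: torch.Tensor,
               ins_distill_loss: torch.Tensor,
               cls_distill_loss: torch.Tensor,
               output_distill_loss: torch.Tensor) -> torch.Tensor:
    """Hypothetical weighted sum of the separation objective and the
    distillation terms; term names are placeholders for illustration."""
    return (separation_loss
            + LAMBDA_INS * ins_distill_loss
            + LAMBDA_CLS * cls_distill_loss
            + LAMBDA_DIST * output_distill_loss)
```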