reproducibilityindex.ai

DiffSED: Sound Event Detection with Denoising Diffusion

Authors: Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu

AAAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on the Urban-SED and EPIC-Sounds datasets demonstrate that our model significantly outperforms existing alternatives, with 40+% faster convergence in training.
Researcher Affiliation	Academia	(1) University of Surrey, UK; (2) Imperial College London, UK
Pseudocode	Yes	Algorithm 1: Training; Algorithm 2: Noise corruption
Open Source Code	Yes	Code: https://github.com/Surrey-UPLab/DiffSED.
Open Datasets	Yes	Extensive experiments on the Urban-SED and EPIC-Sounds datasets... URBAN-SED (Salamon, Jacoby, and Bello 2014)... EPIC-Sounds (Huh et al. 2023).
Dataset Splits	Yes	Figure 3: Convergence rates for SEDT and DiffSED on the URBAN-SED dataset. The dotted lines represent the training epoch when the best-performing checkpoint (the one with the best audio-tagging F1 score on the validation set) arrived... For the EPIC-Sounds dataset, we report the top-1 and top-5 accuracy, as well as mean average precision (mAP), mean area under ROC curve (mAUC), and mean per class accuracy (mCA) on the validation split, following the protocol of (Huh et al. 2023).
Hardware Specification	Yes	All models are trained with 2 NVIDIA A5500 GPUs.
Software Dependencies	No	The paper mentions 'Adam optimizer' and 'ResNet-50' and implies the use of a deep learning framework, but it does not specify versions for any software components, such as Python, PyTorch/TensorFlow, or other libraries.
Experiment Setup	Yes	Our model is trained for 400 epochs, while re-initializing the weights from the best checkpoint for every 100 epochs, using Adam optimizer with an initial learning rate of 10^-4 with a decay schedule of 10^-2. The batch size is set to 64 for URBAN-SED and 128 for EPIC-Sounds.
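
The hyperparameters in the Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch in plain Python, not the authors' code: the names TrainConfig, BATCH_SIZES, make_config, and restart_epochs are hypothetical, and it assumes that "re-initializing the weights from the best checkpoint for every 100 epochs" means a reload at the 100-, 200-, and 300-epoch boundaries. It also reads the garbled learning rate and decay as 10^-4 and 10^-2, and does not interpret what the decay value applies to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the reported training settings."""
    dataset: str
    batch_size: int
    epochs: int = 400            # total training epochs, as reported
    restart_every: int = 100     # reload the best checkpoint every 100 epochs
    learning_rate: float = 1e-4  # initial Adam learning rate (read as 10^-4)
    decay: float = 1e-2          # reported decay value; its exact role is not stated here


# Batch sizes as listed in the Experiment Setup row.
BATCH_SIZES = {"URBAN-SED": 64, "EPIC-Sounds": 128}


def make_config(dataset: str) -> TrainConfig:
    """Build the config for one of the two datasets named in the table."""
    return TrainConfig(dataset=dataset, batch_size=BATCH_SIZES[dataset])


def restart_epochs(cfg: TrainConfig) -> list[int]:
    """Epoch boundaries at which weights would be reloaded (an assumption)."""
    return list(range(cfg.restart_every, cfg.epochs, cfg.restart_every))


if __name__ == "__main__":
    cfg = make_config("URBAN-SED")
    print(cfg)
    print(restart_epochs(cfg))
```

Keeping the reload boundaries in a separate function makes the assumption easy to change if the released code schedules the restarts differently.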