SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform
Authors: Yuhang He, Niki Trigoni, Andrew Markham
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the public DCASE dataset show the advantage of SoundDet on both segment-based and our newly proposed event-based evaluation system. |
| Researcher Affiliation | Academia | Department of Computer Science, University of Oxford, Oxford, United Kingdom. Email: firstname.lastname@cs.ox.ac.uk. |
| Pseudocode | No | The paper provides an architectural illustration in Table 4 and describes components, but does not include structured pseudocode or an algorithm block labeled as such. |
| Open Source Code | No | The paper does not contain any statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluate SoundDet on TAU-NIGENS DCASE sound event detection and localization (SELD) (Adavanne et al., 2018) dataset. |
| Dataset Splits | Yes | We follow the official splits and use 1-6 folds for train and the remaining 7-8 folds (200 1-min recordings) for test. |
| Hardware Specification | Yes | We further report the average inference time of different methods to process a one-minute long audio in Table 3, showing that it is almost twice as fast as EIN, and comparable to SELDNet. Inference time on Intel(R) Core(TM) i9-7920X CPU. |
| Software Dependencies | No | The paper describes optimizers (SGD), learning rates, and network architecture, but does not specify software dependencies (e.g., libraries, frameworks) with version numbers. |
| Experiment Setup | Yes | For the backbone training, we use SGD optimizer with an initial learning rate 0.5, the learning rate decays every 30 epochs with decay rate 0.7. ... H and W indicate dense proposal map height and width, respectively, in our experiment H = 60 and W = 60 |
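
The schedule and split quoted in the table can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code. It assumes the decay is a staircase schedule, with the rate multiplied by 0.7 once every 30 epochs, and that epochs are zero-indexed. The names `step_decay_lr`, `split_for_fold`, `TRAIN_FOLDS`, and `TEST_FOLDS` are hypothetical and do not come from the paper.

```python
# Values quoted in the table above: initial learning rate 0.5,
# decay rate 0.7 applied every 30 epochs, folds 1-6 train, folds 7-8 test.
TRAIN_FOLDS = frozenset(range(1, 7))
TEST_FOLDS = frozenset({7, 8})


def step_decay_lr(epoch, base_lr=0.5, decay_rate=0.7, step_epochs=30):
    """Return the learning rate at a zero-indexed epoch.

    Assumption: staircase decay, i.e. the rate is multiplied by
    `decay_rate` once per completed block of `step_epochs` epochs.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return base_lr * decay_rate ** (epoch // step_epochs)


def split_for_fold(fold):
    """Map a DCASE fold index (1-8) to 'train' or 'test'."""
    if fold in TRAIN_FOLDS:
        return "train"
    if fold in TEST_FOLDS:
        return "test"
    raise ValueError(f"unknown fold: {fold}")


if __name__ == "__main__":
    # Print the learning rate at the start of each 30-epoch block.
    for epoch in (0, 30, 60, 90):
        print(epoch, step_decay_lr(epoch))
    # Print the split assignment for every fold.
    for fold in range(1, 9):
        print(fold, split_for_fold(fold))
```

With a deep learning framework, the same schedule would normally be delegated to its built-in step scheduler. The paper does not name a framework, so none is assumed here.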