Learning to Recognize Transient Sound Events using Attentional Supervision
Authors: Szu-Yu Chou, Jyh-Shing Jang, Yi-Hsuan Yang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that M&mnet works remarkably well for recognizing sound events, establishing a new state-of-the-art for DCASE17 and Audio Set data sets. |
| Researcher Affiliation | Academia | 1Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan 2Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | For reproducibility, we will share the python source code and trained models online through a github repo: https://github.com/fearofchou/mmnet |
| Open Datasets | Yes | The first set of experiments uses DCASE17, a subset of Audio Set that was used in DCASE2017 Challenge Task 4 [Mesaros et al., 2017]. ...The second set of experiments uses Audio Set, containing over 2M audio clips with 527 possible sound events. Google provides a balanced training set with at least 59 examples per class, called Audio Set-22K, and a balanced test set with again at least 59 examples per class, called Audio Set-20K. |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a separate validation split with specific percentages, counts, or a defined method for creating it. |
| Hardware Specification | No | The paper does not specify any hardware details such as CPU, GPU models, or memory. |
| Software Dependencies | No | The paper mentions using the 'librosa library' but does not provide a specific version number. No other software dependencies with version numbers are listed. |
| Experiment Setup | Yes | For optimization, we used SGD with a mini-batch size of 64 and initial learning rate 0.1. We divided the learning rate by 10 every 30 epochs and set the maximal number of epochs to 100. To avoid overfitting, we set the weight decay to 1e-4. |
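
The optimization settings quoted in the last row translate directly into a standard step-decay training schedule. Below is a minimal sketch, assuming a PyTorch implementation (the framework is not stated in this report); `model` is only a placeholder and not the actual M&mnet architecture.

```python
# Hedged sketch of the reported training setup, NOT the authors' released code.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR

model = nn.Sequential(nn.Linear(128, 527))  # placeholder model, not M&mnet

# SGD with mini-batch size 64, initial learning rate 0.1, weight decay 1e-4
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

# Divide the learning rate by 10 every 30 epochs; train for at most 100 epochs
scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    # ... one pass over the training data in mini-batches of 64 would go here ...
    scheduler.step()
```

Momentum and other optimizer details are not listed in this section, so they are omitted from the sketch.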