Weakly-Supervised Temporal Localization via Occurrence Count Learning

Authors: Julien Schroeter, Kirill Sidorov, David Marshall

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate the effectiveness of our approach in a number of experiments (drum hit and piano onset detection in audio, digit detection in images) and demonstrate performance comparable to that of fully-supervised state-of-the-art methods, despite much weaker training requirements."
Researcher Affiliation | Academia | "Julien Schroeter¹, Kirill Sidorov¹, David Marshall¹. ¹Cardiff University, United Kingdom. Correspondence to: Julien Schroeter <SchroeterJ1@cardiff.ac.uk>."
Pseudocode | No | The paper describes the model and its components in text and mathematical formulas but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The full implementation and additional details can be found on the paper's website¹." ¹http://users.cs.cf.ac.uk/SchroeterJ1/publications/LoCo
Open Datasets | Yes | "The model is evaluated on two different datasets: IDMT-SMT-Drums (Dittmar et al., 2014) and ENST-Drums (Gillet & Richard, 2006)." "The MAPS database is used for this evaluation. As in (Hawthorne et al., 2017), the synthesized pieces are used for training, whereas the Disklavier pieces are used for testing." "The well-known MNIST (LeCun et al., 1998) dataset is used to generate samples for this experiment."
Dataset Splits | No | The paper discusses training and testing sets, as well as cross-validation, but does not explicitly define a validation split (with specific percentages or sample counts) used during model training.
Hardware Specification | No | "We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GPU used for this research." This is the only mention of hardware; it does not specify the GPU model or any other details of the experimental setup.
Software Dependencies | No | The paper mentions methods such as "LSTM (Hochreiter & Schmidhuber, 1997) or GRU (Cho et al., 2014)" and the "Adam algorithm (Kingma & Ba, 2015)", but lists no specific software dependencies with version numbers.
Experiment Setup | Yes | "First, the representation learning part of the network is composed of six (3×4) convolutional layers with 8 to 16 filters intertwined with max-pooling layers and ReLU activations. Secondly, the recurrent unit is comprised of a 24-unit LSTM, which is then directly followed by a final 16-node fully-connected prediction layer. The LoCo-loss described in Section 4.2 is optimized using the Adam algorithm (Kingma & Ba, 2015)." (Tmax = 400, kmax = 31)
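
To make the quoted setup concrete, here is a minimal PyTorch sketch of such a CNN + LSTM pipeline. The input shape (a single-channel spectrogram with 64 frequency bins), the per-layer filter schedule, the placement of the pooling layers, and the sigmoid output are all assumptions not stated in the excerpt; the authors' released implementation is the authoritative reference.

```python
import torch
import torch.nn as nn

class LoCoNet(nn.Module):
    """Sketch of the described architecture: six (3x4) conv layers with
    8-16 filters, max-pooling and ReLU, a 24-unit LSTM, and a final
    16-node fully-connected prediction layer."""

    def __init__(self, n_bins=64, n_out=16):
        super().__init__()
        chans = [1, 8, 8, 8, 16, 16, 16]   # assumed 8-to-16 filter schedule
        layers = []
        for i in range(6):
            # padding='same' keeps one CNN output column per input frame
            layers += [nn.Conv2d(chans[i], chans[i + 1], kernel_size=(3, 4),
                                 padding='same'),
                       nn.ReLU()]
            if i % 2 == 1:                 # assumed: pool frequency axis only
                layers.append(nn.MaxPool2d(kernel_size=(2, 1)))
        self.cnn = nn.Sequential(*layers)
        feat = chans[-1] * (n_bins // 2 ** 3)  # channels x remaining freq bins
        self.lstm = nn.LSTM(feat, 24, batch_first=True)
        self.fc = nn.Linear(24, n_out)

    def forward(self, x):                    # x: (batch, 1, n_bins, T)
        h = self.cnn(x)                      # (batch, 16, n_bins/8, T)
        h = h.flatten(1, 2).transpose(1, 2)  # (batch, T, feat)
        h, _ = self.lstm(h)                  # (batch, T, 24)
        return torch.sigmoid(self.fc(h))     # per-frame probabilities

model = LoCoNet()
probs = model(torch.randn(2, 1, 64, 400))    # Tmax = 400 -> (2, 400, 16)
```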
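
The LoCo-loss itself is referenced but not reproduced in the excerpt. Below is a hedged sketch of a count-based loss in its spirit: the negative log-likelihood of the weak count label under the Poisson binomial distribution induced by the per-frame probabilities, computed with the standard Bernoulli-convolution recursion and truncated at kmax = 31. Whether this matches the paper's Section 4.2 exactly is an assumption; consult the paper for the precise definition.

```python
import torch

def count_nll(p, counts, k_max=31, eps=1e-8):
    """NLL of observed occurrence counts given per-frame probabilities.

    p:      (batch, T) event probabilities in [0, 1]
    counts: (batch,)   integer count labels (the weak supervision)
    """
    dist = p.new_zeros(p.shape[0], k_max + 1)  # dist[:, k] = P(count == k)
    dist[:, 0] = 1.0
    for t in range(p.shape[1]):
        pt = p[:, t:t + 1]
        shifted = torch.cat([torch.zeros_like(dist[:, :1]), dist[:, :-1]], 1)
        # one Bernoulli convolution step; mass beyond k_max is discarded
        dist = dist * (1 - pt) + shifted * pt
    lik = dist.gather(1, counts.clamp(max=k_max).unsqueeze(1)).squeeze(1)
    return -(lik + eps).log().mean()

p = torch.sigmoid(torch.randn(2, 400))     # toy per-frame probabilities
loss = count_nll(p, torch.tensor([3, 7]))  # counts 3 and 7 as weak labels
```

Because the recursion only ever needs the distribution from the previous frame, it can run inside the recurrent loop at O(T · kmax) cost, which is consistent with the capped kmax reported in the setup.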