Skipping the Frame-Level: Event-Based Piano Transcription With Neural Semi-CRFs

Authors: Yujia Yan, Frank Cwitkowitz, Zhiyao Duan

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on the MAESTRO dataset and demonstrate that the proposed model surpasses the current state-of-the-art for piano transcription. Our results suggest that the semi-CRF output layer, while still quadratic in complexity, is a simple, fast and well-performing solution for event-based prediction, and may lead to similar success in other areas which currently rely on frame-level estimates.
Researcher Affiliation | Academia | Yujia Yan, Frank Cwitkowitz, Zhiyao Duan; Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627, USA; yujia.yan@rochester.edu, fcwitkow@ur.rochester.edu, zhiyao.duan@rochester.edu
Pseudocode | Yes | Algorithm 1: Forward-backward algorithm for log Z and log Z for a specific event type. Algorithm 2: Viterbi (MAP) decoding of a specific event type within an audio segment. (A generic sketch of these quadratic-time semi-CRF recursions is given after the table.)
Open Source Code | Yes | Code is available at https://github.com/Yujia-Yan/Skipping-The-Frame-Level
Open Datasets | Yes | We conduct our experiments using the MAESTRO v2 dataset [Hawthorne et al., 2019], which contains around 200 hours of MIDI-synchronized (3ms precision) virtuoso piano performance recordings.
Dataset Splits | Yes | We conduct our experiments using the MAESTRO v2 dataset [Hawthorne et al., 2019]... We compare the proposed system to the state-of-the-art methods for piano transcription using the MAESTRO v2 test split. We recompute these metrics for other systems directly from the transcribed MIDI files generated by their pretrained models. We also report results for our model trained and evaluated on the MAESTRO v3 splits for future reference.
Hardware Specification | Yes | Running time (seconds) of algorithm components that have quadratic time complexity w.r.t. the input length on Intel(R) Core(TM) i7-7800X CPU @ 3.50 GHz and Nvidia GTX 1080 Ti.
Software Dependencies | No | The algorithms were implemented in PyTorch, and we believe that further speedup can be achieved with a native C++/CUDA implementation. However, no specific version number for PyTorch or other software dependencies is provided.
Experiment Setup | Yes | We use a batch size of 12 and the AdaBelief [Zhuang et al., 2020] optimizer with a weight decay of 1e-4. We use a OneCycle [Smith and Topin, 2019] learning rate scheduler with the maximum learning rate set to 6e-4 for 180k iterations and cosine annealing. The learning rate is increased gradually for 20% of the iterations and then gradually annealed to 1.5e-5. We automatically determine the value for gradient clipping by using the 0.8 quantile of the gradient norm during the last 10k iterations, a strategy similar to Seetharaman et al. [2020]. We apply dropout with rate 0.1 on the attribute predictors and the score model. (A training-loop sketch wiring up these settings follows the semi-CRF sketch below.)
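
For reference, here is a minimal sketch of the kind of quadratic-time semi-CRF recursions summarized in the Pseudocode row (Algorithms 1-2): a forward pass for the log partition function log Z and a Viterbi (MAP) decoder over candidate segments. It assumes a generic segment score matrix `seg_score[i, j]` and does not reproduce the paper's exact event/attribute parameterization; all names below are illustrative.

```python
import torch

def semicrf_log_z(seg_score: torch.Tensor) -> torch.Tensor:
    """Forward recursion for the log partition function log Z, O(T^2).

    seg_score: (T, T) tensor; seg_score[i, j] is the score of a candidate
    segment covering frames i..j (only entries with i <= j are used).
    """
    T = seg_score.shape[0]
    # alpha[j] = log-sum-exp over the scores of all segmentations of frames 0..j-1
    alpha = seg_score.new_full((T + 1,), float("-inf"))
    alpha[0] = 0.0  # empty prefix
    for j in range(1, T + 1):
        starts = torch.arange(j)  # the last segment is [i, j-1] for some start i
        alpha[j] = torch.logsumexp(alpha[starts] + seg_score[starts, j - 1], dim=0)
    return alpha[T]

def semicrf_viterbi(seg_score: torch.Tensor):
    """Viterbi (MAP) decoding: the single best-scoring segmentation, O(T^2)."""
    T = seg_score.shape[0]
    best = seg_score.new_full((T + 1,), float("-inf"))
    best[0] = 0.0
    back = torch.zeros(T + 1, dtype=torch.long)
    for j in range(1, T + 1):
        cand = best[:j] + seg_score[:j, j - 1]
        best[j], back[j] = cand.max(dim=0)
    segments, j = [], T  # backtrack into (start, end) frame pairs
    while j > 0:
        i = int(back[j])
        segments.append((i, j - 1))
        j = i
    return best[T], segments[::-1]

# Toy usage on a 5-frame input with random segment scores:
scores = torch.randn(5, 5)
log_z = semicrf_log_z(scores)
map_score, events = semicrf_viterbi(scores)
```

In a CRF training setup, the negative log-likelihood would be `semicrf_log_z(seg_score)` minus the score of the reference segmentation, with gradients flowing through the logsumexp operations.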
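Below is a hedged sketch of how the Experiment Setup row could translate into a PyTorch training loop: AdaBelief with weight decay 1e-4, a OneCycle schedule (max learning rate 6e-4 over 180k iterations, 20% warm-up, cosine annealing toward roughly 1.5e-5), and gradient clipping at the 0.8 quantile of recent gradient norms. The model, batch, and loss are placeholders rather than the paper's architecture, and the `adabelief_pytorch` package and the `final_div_factor` mapping are assumptions about how these settings might be wired up.

```python
from collections import deque
import numpy as np
import torch
from torch.optim.lr_scheduler import OneCycleLR
from adabelief_pytorch import AdaBelief  # third-party AdaBelief optimizer (assumed available)

model = torch.nn.Linear(64, 1)  # placeholder for the transcription network
optimizer = AdaBelief(model.parameters(), lr=6e-4, weight_decay=1e-4)
total_steps = 180_000
scheduler = OneCycleLR(
    optimizer,
    max_lr=6e-4,
    total_steps=total_steps,
    pct_start=0.2,           # learning rate ramps up for the first 20% of iterations
    anneal_strategy="cos",   # cosine annealing afterwards
    final_div_factor=1.6,    # with the default div_factor=25, final LR is 6e-4 / (25 * 1.6) = 1.5e-5
)

grad_norms = deque(maxlen=10_000)  # gradient norms from the last 10k iterations

for step in range(total_steps):
    x, y = torch.randn(12, 64), torch.randn(12, 1)    # placeholder batch (batch size 12)
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)  # placeholder for the semi-CRF loss
    loss.backward()
    # Adaptive clipping threshold: 0.8 quantile of recent gradient norms
    # (no clipping until enough history has accumulated).
    max_norm = np.quantile(grad_norms, 0.8) if len(grad_norms) >= 100 else float("inf")
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    grad_norms.append(float(total_norm))
    optimizer.step()
    scheduler.step()
```

The quantile-based threshold adapts the clipping value to the observed gradient-norm distribution rather than using a fixed constant, which is the Seetharaman et al. [2020]-style strategy the paper cites.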