Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders

Authors: Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Mohammad Norouzi, Douglas Eck, Karen Simonyan

ICML 2017

Reproducibility Variable Result LLM Response
Research Type Experimental Using NSynth, we demonstrate improved qualitative and quantitative performance of the WaveNet autoencoder over a well-tuned spectral autoencoder baseline.
Researcher Affiliation Industry 1Google Brain, 2DeepMind. Correspondence to: Jesse Engel <jesseengel@google.com>.
Pseudocode No The paper describes the models and architectures textually and through diagrams (Figure 1), but does not include any explicit pseudocode or algorithm blocks.
Open Source Code No The paper states that the NSynth dataset is publicly available, but it does not provide a link or statement about the open-source code for the WaveNet autoencoder model itself.
Open Datasets Yes The paper states that "the full NSynth dataset will be made publicly available in a serialized format after publication" and that it is available for download at http://download.magenta.tensorflow.org/hans as TFRecord files split into training and holdout sets.
Dataset Splits No The paper mentions that the dataset is split into "training and holdout sets" but does not specify a separate validation split or its proportions.
Hardware Specification No The paper describes training parameters but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies No The paper mentions using an "Adam optimizer" and "TensorFlow Example protocol buffer", but it does not provide specific version numbers for any software dependencies.
Experiment Setup Yes The baseline models commonly use a learning rate of 1e-4, while the WaveNet models use a schedule, starting at 2e-4 and descending to 6e-5, 2e-5, and 6e-6 at iterations 120k, 180k, and 240k respectively. The baseline models train asynchronously for 1800k iterations with a batch size of 8. The WaveNet models train synchronously for 250k iterations with a batch size of 32.
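The step-wise learning-rate schedule reported for the WaveNet models (2e-4 initially, dropping to 6e-5, 2e-5, and 6e-6 at iterations 120k, 180k, and 240k) can be sketched as a simple piecewise-constant function. The function name and structure below are illustrative only; they are not taken from the paper's code.

```python
def wavenet_learning_rate(step):
    """Piecewise-constant schedule matching the iterations and rates
    quoted in the paper. Everything else here is an assumption."""
    boundaries = [120_000, 180_000, 240_000]
    rates = [2e-4, 6e-5, 2e-5, 6e-6]
    for boundary, rate in zip(boundaries, rates):
        if step < boundary:
            return rate
    # Past the final boundary (240k), stay at the lowest rate.
    return rates[-1]
```

For example, `wavenet_learning_rate(0)` yields 2e-4 and `wavenet_learning_rate(200_000)` yields 2e-5, matching the schedule quoted above.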