Learning to Groove with Inverse Sequence Transformations

Authors: Jon Gillick, Adam Roberts, Jesse Engel, Douglas Eck, David Bamman

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent variational Information Bottleneck (VIB) models. Though Seq2Seq models usually require painstakingly aligned corpora, we show that it is possible to adapt an approach from the Generative Adversarial Network (GAN) literature (e.g., Pix2Pix (Isola et al., 2017) and Vid2Vid (Wang et al., 2018a)) to sequences, creating large volumes of paired data by performing simple transformations and training generative models to plausibly invert these transformations. Music, and drumming in particular, provides a strong test case for this approach because many common transformations (quantization, removing voices) have clear semantics, and models for learning to invert them have real-world applications. Focusing on the case of drum set players, we create and release a new dataset for this purpose, containing over 13 hours of recordings by professional drummers aligned with fine-grained timing and dynamics information. We also explore some of the creative potential of these models, including demonstrating improvements on state-of-the-art methods for Humanization (instantiating a performance from a musical score). (A minimal sketch of this paired-data construction appears after this table.)
Researcher Affiliation | Collaboration | 1 School of Information, University of California, Berkeley, CA, U.S.A.; 2 Google AI, Mountain View, CA, U.S.A.
Pseudocode | No | The paper includes architectural diagrams (Figures 3 and 4) but no explicit pseudocode blocks or algorithms.
Open Source Code | Yes | Code, data, trained models, and audio examples are available at https://g.co/magenta/groovae.
Open Datasets | Yes | The dataset, which we refer to as the Groove MIDI Dataset (GMD), is publicly available for download at https://magenta.tensorflow.org/datasets/groove.
Dataset Splits | Yes | After partitioning recorded sequences into training, development, and test sets, we slide fixed-size windows across all full sequences to create drum patterns of fixed length... A train/validation/test split configuration is provided for easier comparison of model accuracy on various tasks. (See the windowing sketch after this table.)
Hardware Specification | No | The paper mentions using a "Roland TD-11 electronic drum kit" for data collection, but does not specify any hardware (CPU, GPU, memory) used for training or running the experiments.
Software Dependencies | No | "We train all our neural models with Tensorflow (Abadi et al., 2016) and the Adam optimizer (Kingma & Ba, 2014)." The paper names TensorFlow and the Adam optimizer but specifies neither the TensorFlow version nor any other software dependencies. (See the optimizer sketch after this table.)
Experiment Setup | No | The paper does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations for the neural models. It mentions "a single hidden layer of size 256 and ReLU nonlinearities" for the MLP and reducing LSTM layer dimensions from 2048 to 512 and the dimension of z from 512 to 256 for Seq2Seq, but these are architectural details, not training setup parameters. (See the MLP sketch after this table.)
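
The inversion recipe described in the Research Type row lends itself to a compact illustration. Below is a minimal Python sketch, not the authors' code, of manufacturing paired training data by quantizing expressive onset times to a metrical grid; the note representation and the 16th-note grid (0.25 beats) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the paired-data idea: apply a
# simple, lossy transformation (here, quantizing onset times to a metrical
# grid) and keep (transformed, original) pairs so a model can learn the
# inverse mapping. The note representation and the 16th-note grid
# (0.25 beats) are illustrative assumptions.

def quantize_onsets(onsets, grid=0.25):
    """Snap each onset time (in beats) to the nearest grid point."""
    return [round(t / grid) * grid for t in onsets]

def make_training_pair(performance_onsets, grid=0.25):
    """Return (input, target): the quantized 'score' and the original
    expressive performance the model should learn to recover."""
    return quantize_onsets(performance_onsets, grid), performance_onsets

# Example: a slightly loose performance and its quantized counterpart.
performed = [0.02, 0.51, 1.04, 1.48, 2.03]
score, target = make_training_pair(performed)
print(score)   # [0.0, 0.5, 1.0, 1.5, 2.0]
print(target)  # [0.02, 0.51, 1.04, 1.48, 2.03]
```

Because the quantization is applied programmatically, arbitrarily many (score, performance) pairs come for free from unpaired performances, which is the property that sidesteps the need for painstakingly aligned corpora.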
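The Dataset Splits row quotes a partition-then-window scheme: recordings are assigned whole to a split first, and fixed-size windows are cut only afterwards, so no window straddles two splits. A hedged sketch of that order of operations follows; the window length, hop size, split proportions, and seed are assumptions, not values from the paper.

```python
# Hedged sketch of the partition-then-window scheme quoted above: whole
# sequences are assigned to a split first, and fixed-size windows are cut
# afterwards, so no window straddles two splits. Window length, hop size,
# split proportions, and seed are assumptions, not values from the paper.
import random

def split_then_window(sequences, window=64, hop=16, seed=0):
    rng = random.Random(seed)
    rng.shuffle(sequences)
    n = len(sequences)
    splits = {
        "train": sequences[: int(0.8 * n)],
        "validation": sequences[int(0.8 * n): int(0.9 * n)],
        "test": sequences[int(0.9 * n):],
    }
    return {
        name: [
            seq[i: i + window]
            for seq in seqs
            for i in range(0, len(seq) - window + 1, hop)
        ]
        for name, seqs in splits.items()
    }

# Toy usage: ten "sequences" of integer timesteps.
data = [list(range(n)) for n in (80, 100, 64, 90, 120, 70, 96, 64, 72, 88)]
print({name: len(w) for name, w in split_then_window(data).items()})
```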
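To make concrete what the Software Dependencies row flags as unspecified, the snippet below instantiates Adam with the standard defaults from Kingma & Ba (2014) in a current TensorFlow API. These defaults are an assumption about the authors' setup, not a reported configuration.

```python
# The paper names TensorFlow and Adam but not versions or hyperparameters.
# The values below are Adam's standard defaults from Kingma & Ba (2014),
# shown as an assumption of what an unspecified setup typically falls back
# to; the authors' actual settings are unknown.
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(
    learning_rate=1e-3,  # default step size
    beta_1=0.9,          # first-moment (momentum) decay rate
    beta_2=0.999,        # second-moment decay rate
    epsilon=1e-8,        # numerical-stability constant
)
```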
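Finally, the one architectural detail quoted in the Experiment Setup row, an MLP with a single hidden layer of size 256 and ReLU nonlinearities, translates directly into a short Keras sketch; the linear output layer and all dimensions other than the hidden size are assumptions.

```python
# Sketch of the quoted MLP baseline: one hidden layer of width 256 with
# ReLU, as stated in the paper. The linear output head and its dimension
# are assumptions; the input width is inferred on the first call.
import tensorflow as tf

def build_mlp(output_dim):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu"),  # stated hidden layer
        tf.keras.layers.Dense(output_dim),              # assumed output head
    ])

model = build_mlp(output_dim=9)  # e.g., one value per drum category (assumed)
```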