FIGARO: Controllable Music Generation using Learned and Expert Features

Authors: Dimitri von Rütte, Luca Biggio, Yannic Kilcher, Thomas Hofmann

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate FIGARO on its ability to adhere to the prescribed condition by comparing it to state-of-the-art methods for controllable symbolic music generation (Choi et al., 2020; Wu & Yang, 2021). We demonstrate empirically that our technique outperforms the state-of-the-art in controllable generation and sample quality.
Researcher Affiliation | Academia | Dimitri von Rütte, Luca Biggio, Yannic Kilcher, Thomas Hofmann; Department of Computer Science, ETH Zürich; dimitri.vonrutte@inf.ethz.ch
Pseudocode | Yes | Algorithm 1: Expert Description
Open Source Code | Yes | We also release the source code and model weights for anyone to download and use freely.[2] Our secondary contribution is REMI+, an extension to the REMI input representation (Huang & Yang, 2020) which opens the way to multi-track, multi-time-signature music. [2] Code and model weights are available through GitHub (https://github.com/dvruette/figaro).
Open Datasets | Yes | We use the Lakh MIDI dataset (Raffel, 2016) as training data in all of our experiments, which to the best of our knowledge is the largest publicly available symbolic music dataset.
Dataset Splits | Yes | We use a 80%-10%-10% training-validation-test split.
Hardware Specification | Yes | Each model is trained for 24 hours on 4 Nvidia GTX 2080 Ti GPUs.
Software Dependencies | No | The paper mentions software components like "Adam optimizer" and implies the use of deep learning frameworks, but it does not specify version numbers for key software dependencies like Python or PyTorch.
Experiment Setup | Yes | We train each model for 100k steps with a batch size of 512 sequences. Models are optimized using the Adam optimizer (Kingma & Ba, 2017) with $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-6}$ and 0.01 weight decay. We use the inverse-square-root learning rate schedule with initial constant warmup at $10^{-4}$ given by $10^{-4} / \max(1, \sqrt{n/N})$ where $N = 4000$ is the number of warmup steps.
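
The learning rate schedule quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it reads the quoted formula as a base rate of 1e-4 divided by max(1, sqrt(n/N)), and the names `learning_rate`, `BASE_LR` and `WARMUP_STEPS` are hypothetical.

```python
import math

# Values taken from the quoted setup; the constant names are illustrative.
BASE_LR = 1e-4        # constant rate held during warmup
WARMUP_STEPS = 4000   # N in the quoted formula


def learning_rate(step: int,
                  base_lr: float = BASE_LR,
                  warmup_steps: int = WARMUP_STEPS) -> float:
    """Inverse-square-root schedule with constant warmup.

    Returns base_lr while step <= warmup_steps, then decays
    proportionally to sqrt(warmup_steps / step).
    """
    return base_lr / max(1.0, math.sqrt(step / warmup_steps))


if __name__ == "__main__":
    # Print the rate at a few points of a 100k-step run.
    for n in (0, 4000, 16000, 100000):
        print(n, learning_rate(n))
```

Under this reading, the rate stays at 1e-4 for the first 4000 steps and has fallen by a factor of 5 at step 100k, since sqrt(100000/4000) = 5. In a PyTorch training loop, the multiplicative factor 1/max(1, sqrt(n/N)) could presumably be supplied to `torch.optim.lr_scheduler.LambdaLR`. The paper excerpt does not say which framework scheduler was used.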