DDSP: Differentiable Digital Signal Processing

Authors: Jesse Engel, Lamtharn (Hanoi) Hantrakul, Chenjie Gu, Adam Roberts

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For empirical verification of this approach, we test two DDSP autoencoder variants (supervised and unsupervised) on two different musical datasets: NSynth (Engel et al., 2017) and a collection of solo violin performances.
Researcher Affiliation | Industry | Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, & Adam Roberts, Google Research, Brain Team, Mountain View, CA 94043, USA. {jesseengel,hanoih,gcj,adarob}@google.com
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | The library is publicly available and we welcome further contributions from the community and domain experts. Code: https://github.com/magenta/ddsp
Open Datasets | Yes | NSynth: We focus on a smaller subset of the NSynth dataset (Engel et al., 2017), consistent with other work (Engel et al., 2019; Hantrakul et al., 2019). Solo violin: Using the MusOpen royalty-free music library, we collected 13 minutes of expressive, solo violin performances. We purposefully selected pieces from a single performer (John Garner) that were monophonic and shared a consistent room environment to encourage the model to focus on performance. Like NSynth, audio is converted to mono 16 kHz and divided into 4-second training examples (64000 samples total). Code to process the audio files into a dataset is available online. (A preprocessing sketch is given after this table.)
Dataset Splits | No | We employ an 80/20 train/test split shuffling across instrument families. The paper mentions a train/test split but does not explicitly provide details for a validation split.
Hardware Specification | No | We express core components as feedforward functions, allowing efficient implementation on parallel hardware such as GPUs and TPUs, and generation of samples during training. The paper mentions general hardware types (GPUs and TPUs) but does not provide the specific models or configurations used for its experiments.
Software Dependencies | No | The paper mentions software such as TensorFlow and CREPE, but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | We used the ADAM optimizer with learning rate 0.001 and exponential learning rate decay of 0.98 every 10,000 steps. In our experiments, we used FFT sizes (2048, 1024, 512, 256, 128, 64), and neighboring frames in the Short-Time Fourier Transform (STFT) overlap by 75%. (A minimal optimizer and loss sketch follows this table.)
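The dataset preparation described in the Open Datasets row (mono 16 kHz audio cut into 4-second, 64000-sample examples) can be illustrated with a short script. This is a minimal sketch, not the authors' released pipeline; it assumes librosa is available, and the function name and directory path are hypothetical.

```python
# Minimal sketch (not the authors' pipeline): resample recordings to mono
# 16 kHz and slice them into non-overlapping 4-second training examples
# (4 s * 16000 Hz = 64000 samples), as described in the Open Datasets row.
import glob

import librosa
import numpy as np

SAMPLE_RATE = 16000
EXAMPLE_LEN = 4 * SAMPLE_RATE  # 64000 samples per training example


def make_examples(wav_paths):
    """Return an array of shape [num_examples, 64000] of mono 16 kHz audio."""
    examples = []
    for path in wav_paths:
        # librosa downmixes to mono and resamples to the requested rate.
        audio, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
        n_full = len(audio) // EXAMPLE_LEN
        for i in range(n_full):
            examples.append(audio[i * EXAMPLE_LEN:(i + 1) * EXAMPLE_LEN])
    return np.stack(examples) if examples else np.empty((0, EXAMPLE_LEN))


# Hypothetical directory of solo-violin recordings.
examples = make_examples(glob.glob("violin_wavs/*.wav"))
```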
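The Experiment Setup row fully specifies the optimizer schedule and the STFT resolutions of the spectral loss, so the setup can be sketched directly. The snippet below assumes TensorFlow and shows only an L1 loss on STFT magnitudes; it is an illustration of the reported hyperparameters, not the reference implementation in magenta/ddsp (whose multi-scale spectral loss also compares log magnitudes).

```python
# Minimal sketch of the reported training setup, assuming TensorFlow:
# ADAM at 1e-3 with exponential decay of 0.98 every 10,000 steps, and a
# multi-scale spectral L1 loss over STFTs with 75% overlap between frames.
import tensorflow as tf

FFT_SIZES = (2048, 1024, 512, 256, 128, 64)

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=10_000, decay_rate=0.98)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)


def multiscale_spectral_loss(target_audio, generated_audio):
    """L1 distance between STFT magnitudes at several FFT resolutions."""
    loss = 0.0
    for fft_size in FFT_SIZES:
        frame_step = fft_size // 4  # 75% overlap between neighboring frames
        target_mag = tf.abs(tf.signal.stft(
            target_audio, frame_length=fft_size, frame_step=frame_step,
            fft_length=fft_size))
        generated_mag = tf.abs(tf.signal.stft(
            generated_audio, frame_length=fft_size, frame_step=frame_step,
            fft_length=fft_size))
        # The paper's full loss additionally compares log magnitudes;
        # only the linear-magnitude term is shown here.
        loss += tf.reduce_mean(tf.abs(target_mag - generated_mag))
    return loss
```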