Soft-DTW: a Differentiable Loss Function for Time-Series

Authors: Marco Cuturi, Mathieu Blondel

ICML 2017

Reproducibility variables. Each entry gives the assessed result, followed by the supporting LLM response (quoted text is excerpted from the paper):
Research Type: Experimental
"We close this paper with experimental results in Section 4 that showcase each of these potential applications. Throughout this section, we use the UCR (University of California, Riverside) time series classification archive (Chen et al., 2015)."
Researcher Affiliation: Collaboration
"¹CREST, ENSAE, Université Paris-Saclay, France. ²NTT Communication Science Laboratories, Seika-cho, Kyoto, Japan."
Pseudocode: Yes
"Algorithm 1: Forward recursion to compute dtwγ(x, y) and intermediate alignment costs. Algorithm 2: Backward recursion to compute ∇x dtwγ(x, y)."
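As a concrete reading of those two algorithms, here is a minimal NumPy sketch, assuming a squared Euclidean alignment cost; the function and variable names are ours, not the authors'. The smoothed minimum softmin3, i.e. −γ log Σ exp(−·/γ), replaces the hard minimum of classical DTW, which is what makes the loss differentiable.

```python
import numpy as np

def softmin3(a, b, c, gamma):
    """Smoothed minimum of three values: -gamma * log sum exp(-. / gamma)."""
    z = np.array([a, b, c]) / -gamma
    m = z.max()  # stabilize the log-sum-exp
    return -gamma * (m + np.log(np.sum(np.exp(z - m))))

def soft_dtw_value_and_grad(x, y, gamma=1.0):
    """Sketch of Algorithms 1 and 2: dtw_gamma(x, y) and its gradient in x.

    x: (n, d) array, y: (m, d) array; squared Euclidean cost assumed.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    D = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # pairwise costs

    # Forward recursion (Algorithm 1): intermediate alignment costs R.
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            R[i, j] = D[i - 1, j - 1] + softmin3(
                R[i - 1, j], R[i, j - 1], R[i - 1, j - 1], gamma)

    # Backward recursion (Algorithm 2): expected alignment matrix E,
    # computed right-to-left over padded copies of R and D.
    Rp = np.full((n + 2, m + 2), -np.inf)
    Rp[1:n + 1, 1:m + 1] = R[1:, 1:]
    Rp[n + 1, m + 1] = R[n, m]
    Dp = np.zeros((n + 2, m + 2))
    Dp[1:n + 1, 1:m + 1] = D
    E = np.zeros((n + 2, m + 2))
    E[n + 1, m + 1] = 1.0
    for i in range(n, 0, -1):
        for j in range(m, 0, -1):
            a = np.exp((Rp[i + 1, j] - Rp[i, j] - Dp[i + 1, j]) / gamma)
            b = np.exp((Rp[i, j + 1] - Rp[i, j] - Dp[i, j + 1]) / gamma)
            c = np.exp((Rp[i + 1, j + 1] - Rp[i, j] - Dp[i + 1, j + 1]) / gamma)
            E[i, j] = a * E[i + 1, j] + b * E[i, j + 1] + c * E[i + 1, j + 1]

    # Chain rule through the squared Euclidean cost:
    # grad wrt x_i = sum_j E[i, j] * 2 * (x_i - y_j).
    G = np.zeros_like(x)
    for i in range(n):
        for j in range(m):
            G[i] += E[i + 1, j + 1] * 2.0 * (x[i] - y[j])
    return R[n, m], G
```

Returning the gradient alongside the value is what lets soft-DTW serve as a loss in gradient-based training.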
Open Source Code: Yes
"Source code is available at https://github.com/mblondel/soft-dtw."
Open Datasets: Yes
"Throughout this section, we use the UCR (University of California, Riverside) time series classification archive (Chen et al., 2015)."
Dataset Splits: Yes
"We use 50% of the data for training, 25% for validation and 25% for testing."
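The paper reports the ratios but not the splitting procedure or random seed, so a reproduction must pick its own; a minimal sketch, in which the seed and the stand-in data are assumptions:

```python
import numpy as np

rng = np.random.RandomState(0)              # seed is an assumption; none is given
X = rng.randn(100, 150)                     # stand-in for a UCR dataset of 100 series
idx = rng.permutation(len(X))
n_train, n_val = len(X) // 2, len(X) // 4   # 50% / 25% / 25%
train, val, test = np.split(idx, [n_train, n_train + n_val])
X_train, X_val, X_test = X[train], X[val], X[test]
```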
Hardware Specification: No
The paper does not specify the hardware used for its experiments (no CPU or GPU models); it mentions only "a machine" and training MLPs, with no further detail.
Software Dependencies: No
"We implemented a custom backward pass in Chainer, which can then be used to plug soft-DTW as a loss function in any network architecture. To estimate the MLP's parameters, we used Chainer's implementation of Adam (Kingma & Ba, 2014)." No version numbers for Chainer or other libraries are given.
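To illustrate what plugging that custom backward pass into Chainer could look like, here is a hypothetical wrapper around soft_dtw_value_and_grad from the sketch above, using the classic chainer.Function interface; it is not the authors' implementation, and SoftDTWLoss is our name.

```python
import numpy as np
import chainer

class SoftDTWLoss(chainer.Function):
    """Hypothetical soft-DTW loss against a fixed target series y."""

    def __init__(self, y, gamma=1.0):
        self.y, self.gamma = y, gamma

    def forward_cpu(self, inputs):
        x, = inputs
        # The forward pass also caches the gradient for the backward pass.
        val, self.grad = soft_dtw_value_and_grad(x, self.y, self.gamma)
        return np.asarray(val, dtype=x.dtype),

    def backward_cpu(self, inputs, grad_outputs):
        gy, = grad_outputs  # scalar upstream gradient
        return (gy * self.grad).astype(inputs[0].dtype),
```

A network output x (a chainer.Variable) could then be trained with loss = SoftDTWLoss(y_target, gamma=1.0)(x) and chainer.optimizers.Adam, in the spirit of what the paper describes.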
Experiment Setup: Yes
"For each method, we set the maximum number of iterations to 100. To minimize the proposed soft-DTW barycenter objective, Eq. (4), we use L-BFGS. We set the maximum number of outer iterations to 30 and the maximum number of inner (barycenter) iterations to 100, as before. Again, for soft-DTW, we use L-BFGS. We choose γ from 15 log-spaced values between 10⁻³ and 10. To estimate the MLP's parameters, we used Chainer's implementation of Adam (Kingma & Ba, 2014)."
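To make this setup concrete, here is a sketch of the γ grid and of an unweighted barycenter fit with L-BFGS via SciPy (the paper's Eq. (4) also allows per-series weights), reusing soft_dtw_value_and_grad from the sketch above; the helper name and the use of scipy.optimize.minimize are our assumptions, not the authors' code.

```python
import numpy as np
from scipy.optimize import minimize

def soft_dtw_barycenter(series, gamma, z0, max_iter=100):
    """Minimize sum_i dtw_gamma(z, y_i) over z with L-BFGS, starting from z0."""
    shape = z0.shape

    def objective(z_flat):
        z = z_flat.reshape(shape)
        val, grad = 0.0, np.zeros(shape)
        for y in series:
            v, g = soft_dtw_value_and_grad(z, y, gamma)
            val += v
            grad += g
        return val, grad.ravel()

    res = minimize(objective, z0.ravel(), jac=True, method="L-BFGS-B",
                   options={"maxiter": max_iter})  # 100 iterations, as reported
    return res.x.reshape(shape)

gammas = np.logspace(-3, 1, num=15)  # 15 log-spaced values between 1e-3 and 10
```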