Differentiable Dynamic Programming for Structured Prediction and Attention
Authors: Arthur Mensch, Mathieu Blondel
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We showcase these instantiations on structured prediction (audio-to-score alignment, NER) and on structured and sparse attention for translation. ... We measure the performance of the different losses and regularizations on the four languages of the CoNLL 2003 dataset. Results are reported in Table 1, along with reference results with different pretrained embeddings. ... We perform a leave-one-out cross-validation of our model performance, learning the multinomial classifier on 9 pieces and assessing the quality of the alignment on the remaining piece. ... Experiments. We demonstrate structured attention layers with an LSTM encoder and decoder to perform French to English translation... |
| Researcher Affiliation | Collaboration | ¹Inria, CEA, Université Paris-Saclay, Gif-sur-Yvette, France. Work performed at ²NTT Communication Science Laboratories, Kyoto, Japan. |
| Pseudocode | Yes | Pseudo-code is summarized in A.5. ... Pseudo-code for VitΩ(θ), as well as gradient and Hessian-product computations, is provided in B.2. ... Pseudo-code to compute DTWΩ(θ) as well as its gradient and its Hessian products are provided in B.3. (A minimal sketch of the DTWΩ recursion appears after this table.) |
| Open Source Code | Yes | We have released an optimized and modular PyTorch implementation for reproduction and reuse. |
| Open Datasets | Yes | We measure the performance of the different losses and regularizations on the four languages of the CoNLL 2003 dataset. ... We use our framework to perform supervised audio-to-score alignment on the Bach10 dataset (Duan & Pardo, 2011). |
| Dataset Splits | Yes | We perform a leave-one-out cross-validation of our model performance, learning the multinomial classifier on 9 pieces and assessing the quality of the alignment on the remaining piece. |
| Hardware Specification | No | AM thanks Julien Mairal, Inria Thoth and Inria Parietal for lending him the computational resources necessary to run the experiments. However, no specific hardware details (e.g., GPU/CPU models, memory) are provided. |
| Software Dependencies | No | We have released an optimized and modular PyTorch implementation for reproduction and reuse. However, a specific version number for PyTorch or other software dependencies is not provided. |
| Experiment Setup | Yes | Architecture details are provided in C.1. ... We set the cost between an audio frame and a key to be the log-likelihood of this key given a multinomial linear classifier: $\forall i \in [N_A],\ l_i \triangleq \log(\operatorname{softmax}(W a_i + c)) \in \mathbb{R}^K$ and $\forall j \in [N_B],\ \theta_{i,j} \triangleq l_{i,b_j}$, where $(W, c) \in \mathbb{R}^{D \times K} \times \mathbb{R}^K$ are learned classifier parameters. ... We demonstrate structured attention layers with an LSTM encoder and decoder... (A sketch of this cost construction follows the table.) |
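
For concreteness, here is a minimal NumPy sketch of the alignment cost quoted in the Experiment Setup row. It is our illustration, not the authors' released code: the function name `alignment_costs`, the variable names, and the array layout (storing W as D × K so that `A @ W` applies the classifier to every frame at once) are all assumptions.

```python
import numpy as np

def alignment_costs(A, b, W, c):
    """Cost matrix theta[i, j] = l_{i, b_j} for audio-to-score alignment.

    A : (N_A, D) audio-frame features
    b : (N_B,)  integer key index of each score position, values in [0, K)
    W : (D, K) and c : (K,) multinomial linear-classifier parameters

    l_i = log softmax(W a_i + c) is frame i's log-likelihood over the K
    keys; theta[i, j] picks out the log-likelihood of key b_j at score
    position j.
    """
    logits = A @ W + c                                   # (N_A, K)
    logits -= logits.max(axis=1, keepdims=True)          # stabilize exp
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return log_probs[:, b]                               # (N_A, N_B)
```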
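
And, for the Pseudocode row, a rough sketch of the forward recursion that the DTWΩ pseudo-code computes; the paper's Appendix B.3 and released PyTorch code are the authoritative versions. With negentropy regularization the smoothed max reduces to a scaled log-sum-exp. The names `max_omega` and `dtw_omega`, the plain-NumPy formulation, and the sign convention (smoothing a max over the log-likelihood scores above, rather than a min over distances as in soft-DTW) are our assumptions.

```python
import numpy as np

def max_omega(values, gamma=1.0):
    """Negentropy-smoothed max: gamma * log sum exp(values / gamma)."""
    m = values.max()
    return m + gamma * np.log(np.exp((values - m) / gamma).sum())

def dtw_omega(theta, gamma=1.0):
    """Smoothed-DTW value of a score matrix theta of shape (N_A, N_B).

    Each cell combines its three DP predecessors (up, left, diagonal)
    with the smoothed max instead of a hard max, which makes the final
    value differentiable in theta.
    """
    n_a, n_b = theta.shape
    v = np.full((n_a + 1, n_b + 1), -np.inf)
    v[0, 0] = 0.0
    for i in range(1, n_a + 1):
        for j in range(1, n_b + 1):
            prev = np.array([v[i - 1, j], v[i, j - 1], v[i - 1, j - 1]])
            v[i, j] = theta[i - 1, j - 1] + max_omega(prev, gamma)
    return v[n_a, n_b]

# Usage sketch: value = dtw_omega(alignment_costs(A, b, W, c))
# The gradient of dtw_omega with respect to theta is the expected (soft)
# alignment matrix; the paper obtains it with a backward recursion
# rather than generic autodiff.
```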