"Hey, that’s not an ODE": Faster ODE Adjoints via Seminorms
Authors: Patrick Kidger, Ricky T. Q. Chen, Terry J. Lyons
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a wide range of tasks including time series, generative modeling, and physical control demonstrate a median improvement of 40% fewer function evaluations. On some problems we see as much as 62% fewer function evaluations, so that the overall training time is roughly halved. (Section 3, Experiments:) We compare our proposed technique against conventionally-trained neural differential equations across multiple tasks: time series, generative, and physics-informed. |
| Researcher Affiliation | Academia | 1Mathematical Institute, University of Oxford, Oxford, United Kingdom 2The Alan Turing Institute, The British Library, London, United Kingdom 3University of Toronto 4The Vector Institute. Correspondence to: Patrick Kidger <kidger@maths.ox.ac.uk>. |
| Pseudocode | No | The paper does not contain a structured pseudocode block or an algorithm section. |
| Open Source Code | Yes | torchdiffeq (Chen et al., 2018) now supports adjoint seminorms as a built-in option. This may be used by passing odeint_adjoint(..., adjoint_options=dict(norm="seminorm")). The code for all our experiments can be found at https://github.com/patrick-kidger/FasterNeuralDiffEq. |
| Open Datasets | Yes | We apply a Neural CDE to the Speech Commands dataset (Warden, 2020). In Table 2 we show the final test performance and the total number of function evaluations (NFEs) used in the adjoint method over 100 epochs. We see substantially fewer NFEs in experiments on both MNIST and CIFAR-10. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits with percentages or specific counts for reproducibility. It refers to 'test set accuracy' but not the split methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | torchdiffeq (Chen et al., 2018) now supports adjoint seminorms as a built-in option. This may be used by passing odeint_adjoint(..., adjoint_options=dict(norm="seminorm")). The code for all our experiments can be found at https://github.com/patrick-kidger/FasterNeuralDiffEq. We used the torchcde package (Kidger, 2020), which wraps torchdiffeq. No version numbers for these dependencies are given. |
| Experiment Setup | Yes | We investigate how the effect changes for varying tolerances by varying the pair (RTOL, ATOL) over (10⁻³, 10⁻⁶), (10⁻⁴, 10⁻⁷), and (10⁻⁵, 10⁻⁸). In each case, see Appendix B for details on hyperparameters, optimisers, and so on. |
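The paper's technique can be illustrated outside of torchdiffeq. Adaptive ODE solvers accept or reject a step by comparing an RMS norm of the local error estimate against the tolerances; the paper's observation is that during the backward (adjoint) solve, the components of the augmented state that accumulate parameter gradients are pure integrals and need not drive step-size control, so a *seminorm* that ignores them suffices. The sketch below is a minimal, hypothetical illustration of that norm swap (the component split and error values are invented for the example, and this is not the torchdiffeq implementation):

```python
import math

def rms_norm(x):
    # Standard RMS norm used by adaptive solvers for error control:
    # every component of the augmented state contributes.
    return math.sqrt(sum(v * v for v in x) / len(x))

def adjoint_seminorm(x, n_state):
    # Seminorm for the adjoint backward pass: only the first n_state
    # components (state + adjoint state) contribute; the trailing
    # parameter-gradient components are still integrated, but excluded
    # from the error estimate used for step-size control.
    head = x[:n_state]
    return math.sqrt(sum(v * v for v in head) / len(head))

# Hypothetical local error estimate for an augmented adjoint state:
# 2 state components, 2 adjoint components, then 4 parameter-gradient
# components whose (noisier) error would otherwise dominate the norm.
err = [1e-4, 2e-4, 1e-4, 3e-4, 5e-2, 4e-2, 6e-2, 5e-2]

# A step is accepted when norm(err) <= tolerance, so the smaller
# seminorm lets the solver take larger steps (fewer evaluations).
print(rms_norm(err), adjoint_seminorm(err, n_state=4))
```

In torchdiffeq itself, per the quoted excerpt above, the same effect is obtained by passing `odeint_adjoint(..., adjoint_options=dict(norm="seminorm"))`.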