STEER: Simple Temporal Regularization For Neural ODE
Authors: Arnab Ghosh, Harkirat Behl, Emilien Dupont, Philip Torr, Vinay Namboodiri
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show through experiments on normalizing flows, time series models and image recognition that the proposed regularization can significantly decrease training time and even improve performance over baseline models. |
| Researcher Affiliation | Academia | Arnab Ghosh University of Oxford arnabg@robots.ox.ac.uk Harkirat Singh Behl University of Oxford harkirat@robots.ox.ac.uk Emilien Dupont University of Oxford emilien.dupont@stats.ox.ac.uk Philip H. S. Torr University of Oxford phst@robots.ox.ac.uk Vinay Namboodiri University of Bath vpn22@bath.ac.uk |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. The only link found is to a third-party Neural ODE example (rtqichen/torchdiffeq) which is not the authors' own implementation code for STEER. |
| Open Datasets | Yes | We focus on the task of density estimation of image datasets using the same architectures and experimental setup as FFJORD [18]. More details are provided in the supplementary material. The performance of the model is evaluated on multiple datasets, namely MNIST [32], CIFAR-10 [30] and ImageNet-32 [10]... The Human Activity dataset [46]... Physionet [48]... The Mujoco experiments... |
| Dataset Splits | Yes | We focus on the task of density estimation of image datasets using the same architectures and experimental setup as FFJORD [18]. More details are provided in the supplementary material. ... We use the same architecture and experiment settings as used in Latent ODEs [46]. ... We use the same architectures and experimental settings as used in [14] for both Neural ODEs (NODE) and Augmented Neural ODEs (ANODE). |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions 'torchdiffeq' and the 'Dormand-Prince ODE solver' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For the Mujoco experiment 15 and 30 latent dimensions were used for the generative and recognition models respectively. The ODE function had 500 units in each of the 3 layers. A 15 dimensional latent state was used for the Human Activity dataset. For the Physionet experiments the ODE function has 3 layers with 50 units each. The classifier consisted of 2 layers with 300 units each. ... For the MNIST experiments, 92 filters were used for NODEs while 64 filters were used for ANODEs to compensate for 5 augmented dimensions. For the CIFAR-10 and SVHN experiments, 125 filters were used for NODEs while 64 filters were used for ANODEs with 10 augmented dimensions. ... STEER regularization with b = 0.99. ... with STEER regularization with b = 0.124 |
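The STEER regularization quoted above amounts to perturbing the final integration time of the Neural ODE during training, sampling it uniformly in an interval of half-width b around the nominal endpoint (with b below the interval length so the endpoint stays past the start time), while evaluation uses the fixed endpoint. A minimal sketch of this idea, using a toy fixed-step Euler integrator and a hand-written dynamics function as hypothetical stand-ins for the paper's learned ODE function and Dormand-Prince solver:

```python
import numpy as np

rng = np.random.default_rng(0)

def odeint_euler(f, x0, t0, t1, steps=100):
    """Fixed-step Euler integration of dx/dt = f(t, x) from t0 to t1.
    Illustrative stand-in for an adaptive solver such as Dormand-Prince."""
    x, t = np.asarray(x0, dtype=float), t0
    h = (t1 - t0) / steps
    for _ in range(steps):
        x = x + h * f(t, x)
        t += h
    return x

def steer_endpoint(t0, t1, b, rng):
    """STEER: sample the final integration time uniformly in (t1 - b, t1 + b).
    Requires b < t1 - t0 so the sampled endpoint stays strictly after t0."""
    assert b < t1 - t0
    return rng.uniform(t1 - b, t1 + b)

# Toy dynamics dx/dt = -x, standing in for a learned network.
f = lambda t, x: -x

t0, t1, b = 0.0, 1.0, 0.99  # b = 0.99 as quoted for one experimental setting
x0 = np.array([1.0])

# Training-time forward pass: integrate to a freshly perturbed endpoint.
t_end = steer_endpoint(t0, t1, b, rng)
x_train = odeint_euler(f, x0, t0, t_end)

# Evaluation uses the fixed endpoint t1.
x_eval = odeint_euler(f, x0, t0, t1)
```

The per-iteration resampling of `t_end` is the entire mechanism: no extra loss term is added, which is why the paper can reuse the FFJORD and Latent ODE architectures and settings unchanged.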