STEER : Simple Temporal Regularization For Neural ODE

Authors: Arnab Ghosh, Harkirat Behl, Emilien Dupont, Philip Torr, Vinay Namboodiri

NeurIPS 2020

Reproducibility Variable | Result | LLM Response

- Research Type | Experimental | "We show through experiments on normalizing flows, time series models and image recognition that the proposed regularization can significantly decrease training time and even improve performance over baseline models."
- Researcher Affiliation | Academia | Arnab Ghosh, University of Oxford (arnabg@robots.ox.ac.uk); Harkirat Singh Behl, University of Oxford (harkirat@robots.ox.ac.uk); Emilien Dupont, University of Oxford (emilien.dupont@stats.ox.ac.uk); Philip H. S. Torr, University of Oxford (phst@robots.ox.ac.uk); Vinay Namboodiri, University of Bath (vpn22@bath.ac.uk)
- Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
- Open Source Code | No | The paper does not provide concrete access to source code for its methodology. The only link found points to a third-party Neural ODE example (rtqichen/torchdiffeq), which is not the authors' own STEER implementation.
- Open Datasets | Yes | "We focus on the task of density estimation of image datasets using the same architectures and experimental setup as FFJORD [18]. More details are provided in the supplementary material. The performance of the model is evaluated on multiple datasets, namely MNIST [32], CIFAR-10 [30] and ImageNet-32 [10]... The Human Activity dataset [46]... Physionet [48]... The Mujoco experiments..."
- Dataset Splits | Yes | "We focus on the task of density estimation of image datasets using the same architectures and experimental setup as FFJORD [18]. More details are provided in the supplementary material. ... We use the same architecture and experiment settings as used in Latent ODEs [46]. ... We use the same architectures and experimental settings as used in [14] for both Neural ODEs (NODE) and Augmented Neural ODEs (ANODE)."
- Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments, such as specific GPU or CPU models.
- Software Dependencies | No | The paper mentions torchdiffeq and the Dormand-Prince ODE solver but does not give version numbers for these or any other software dependencies.
- Experiment Setup | Yes | "For the Mujoco experiment 15 and 30 latent dimensions were used for the generative and recognition models respectively. The ODE function had 500 units in each of the 3 layers. A 15 dimensional latent state was used for the Human Activity dataset. For the Physionet experiments the ODE function has 3 layers with 50 units each. The classifier consisted of 2 layers with 300 units each. ... For the MNIST experiments, 92 filters were used for NODEs while 64 filters were used for ANODEs to compensate for 5 augmented dimensions. For the CIFAR-10 and SVHN experiments, 125 filters were used for NODEs while 64 filters were used for ANODEs with 10 augmented dimensions. ... STEER regularization with b = 0.99. ... with STEER regularization with b = 0.124"
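The experiment-setup rows quote fixed values of the STEER bound b per task. Since the paper itself ships no code, the sketch below is not the authors' implementation; it only illustrates STEER's core mechanism as described in the paper: at training time the integration end point is sampled uniformly from (T - b, T + b) rather than held fixed at T. The toy dynamics, the fixed-step Euler integrator, and all function names here are illustrative assumptions.

```python
import random

def steer_end_time(T, b, rng=random):
    # STEER samples the terminal integration time uniformly from (T - b, T + b).
    # b must be strictly smaller than T so the integration interval stays positive.
    assert 0 < b < T
    return rng.uniform(T - b, T + b)

def euler_integrate(f, z0, t0, t1, steps=100):
    # Plain fixed-step Euler solve of dz/dt = f(t, z) from t0 to t1
    # (stand-in for the adaptive Dormand-Prince solver the paper uses).
    h = (t1 - t0) / steps
    z, t = z0, t0
    for _ in range(steps):
        z = z + h * f(t, z)
        t += h
    return z

# Illustrative training-time use: each forward pass integrates to a randomly
# perturbed end time instead of a fixed T (here T = 1.0, b = 0.5 are toy values).
f = lambda t, z: -z  # toy dynamics; the paper uses a learned neural network here
t1 = steer_end_time(T=1.0, b=0.5)
z1 = euler_integrate(f, z0=1.0, t0=0.0, t1=t1)
```

At evaluation time the end point would be held at the fixed T; only training randomizes it, which is what regularizes the learned dynamics.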