Neural Ordinary Differential Equations

Authors: Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David K. Duvenaud

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we experimentally investigate the training of neural ODEs for supervised learning. ... We investigate the ability of the latent ODE model to fit and extrapolate time series. ... Table 1: Performance on MNIST. ... Table 2: Predictive RMSE on test set
Researcher Affiliation | Academia | Ricky T. Q. Chen*, Yulia Rubanova*, Jesse Bettencourt*, David Duvenaud, University of Toronto, Vector Institute
Pseudocode | Yes | Algorithm 1: Reverse-mode derivative of an ODE initial value problem
Open Source Code | No | We have since released a PyTorch (Paszke et al., 2017) implementation, including GPU-based implementations of several standard ODE solvers, at .
Open Datasets | Yes | Table 1: Performance on MNIST. From LeCun et al. (1998).
Dataset Splits | No | For the MNIST dataset, no specific train/validation/test split percentages or counts are provided. For the spiral dataset, the paper states "randomly sample points from each trajectory without replacement (n = {30, 50, 100})" and refers to "100 time points extending beyond those that were used for training" (implying a train/test division), but no separate validation split is mentioned.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions software such as LSODE, VODE, Python's Autograd framework, TensorFlow, and PyTorch, but does not provide version numbers for these dependencies.
Experiment Setup | Yes | The recognition network is an RNN with 25 hidden units. We use a 4-dimensional latent space. We parameterize the dynamics function f with a one-hidden-layer network with 20 hidden units. The decoder computing p(x_ti | z_ti) is another neural network with one hidden layer with 20 hidden units. ... For this task, we minimize KL(q(x) ‖ p(x)) as the loss function ... and train for 10,000 iterations using Adam (Kingma and Ba, 2014). In contrast, the NF is trained for 500,000 iterations using RMSprop (Hinton et al., 2012).
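The Pseudocode row cites Algorithm 1, the paper's reverse-mode derivative of an ODE initial value problem (the adjoint sensitivity method). Below is a minimal, self-contained sketch of that algorithm, not the authors' implementation: the linear dynamics f, the parameter matrix theta, and the quadratic loss are hypothetical stand-ins, chosen so the vector-Jacobian products have closed forms.

```python
import numpy as np

def f(z, t, theta):
    """Toy linear dynamics dz/dt = theta @ z (a stand-in for a neural net)."""
    return theta @ z

def rk4_step(g, s, t, dt):
    """One classical Runge-Kutta step for ds/dt = g(s, t)."""
    k1 = g(s, t)
    k2 = g(s + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = g(s + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = g(s + dt * k3, t + dt)
    return s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def odeint(g, s0, t0, t1, steps=200):
    """Fixed-step integration from t0 to t1 (t1 < t0 integrates backward)."""
    s, t, dt = s0.copy(), t0, (t1 - t0) / steps
    for _ in range(steps):
        s = rk4_step(g, s, t, dt)
        t += dt
    return s

dim = 2
theta = np.array([[0.0, 1.0], [-1.0, 0.0]])  # hypothetical parameters
z0 = np.array([1.0, 0.0])
t0, t1 = 0.0, 1.0

# Forward pass: solve the IVP for z(t1).
z1 = odeint(lambda z, t: f(z, t, theta), z0, t0, t1)

# Loss L = 0.5 * ||z(t1)||^2 gives the initial adjoint a(t1) = dL/dz(t1) = z(t1).
a1 = z1.copy()

def aug_dynamics(s, t):
    """Algorithm 1's augmented dynamics: d/dt [z, a, dL/dtheta]."""
    z, a = s[:dim], s[dim:2 * dim]
    dz = f(z, t, theta)
    da = -a @ theta                  # -a^T (df/dz); df/dz = theta for linear f
    dth = -np.outer(a, z).ravel()    # -a^T (df/dtheta) for linear f
    return np.concatenate([dz, da, dth])

# Backward pass: integrate the augmented state from t1 back to t0.
s1 = np.concatenate([z1, a1, np.zeros(dim * dim)])
s0 = odeint(aug_dynamics, s1, t1, t0)
print("dL/dz(t0):", s0[dim:2 * dim])
print("dL/dtheta:", s0[2 * dim:].reshape(dim, dim))
```

In the paper, f is a neural network and the products a^T (df/dz) and a^T (df/dtheta) are computed by automatic differentiation rather than by hand; the structure of the backward solve is the same.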
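The Open Source Code row quotes a PyTorch implementation released after publication; the URL is elided above. Assuming it refers to the authors' torchdiffeq package, typical usage of the adjoint-backed solver looks like the sketch below (the two-layer ODEFunc is a placeholder, not a model from the paper):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint  # backprop via the adjoint method

class ODEFunc(nn.Module):
    """Placeholder dynamics network; any nn.Module with forward(t, z) works."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 20), nn.Tanh(), nn.Linear(20, 2))

    def forward(self, t, z):
        return self.net(z)

func = ODEFunc()
z0 = torch.randn(16, 2)               # batch of initial states
t = torch.linspace(0.0, 1.0, 10)      # times at which to report the solution
zt = odeint(func, z0, t)              # shape (10, 16, 2)
loss = zt[-1].pow(2).mean()
loss.backward()                       # gradients computed as in Algorithm 1
```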
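The Experiment Setup row fixes the layer sizes of the latent ODE used in the time-series experiments. The sketch below instantiates those quoted sizes; everything not quoted (activation functions, how the RNN's final state parameterizes the approximate posterior) is an assumption:

```python
import torch
import torch.nn as nn

class LatentODE(nn.Module):
    """Latent ODE components with the sizes quoted above; the wiring is assumed."""
    def __init__(self, obs_dim=2, latent_dim=4, rnn_hidden=25, hidden=20):
        super().__init__()
        # Recognition network: an RNN with 25 hidden units over the observations.
        self.rnn = nn.RNN(obs_dim, rnn_hidden, batch_first=True)
        self.to_q = nn.Linear(rnn_hidden, 2 * latent_dim)  # mean and log-variance
        # Dynamics f: one hidden layer with 20 units, 4-dimensional latent space.
        self.f = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, latent_dim))
        # Decoder for p(x_ti | z_ti): one hidden layer with 20 units.
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, obs_dim))

    def encode(self, x):
        _, h = self.rnn(x)                            # final hidden state
        mean, logvar = self.to_q(h[-1]).chunk(2, dim=-1)
        return mean, logvar
```

Training would then sample z at t0 from the encoded posterior, integrate self.f through the observation times with an ODE solver, decode each state, and optimize for 10,000 iterations with Adam, per the quoted setup.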