Neural Ordinary Differential Equations
Authors: Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David K. Duvenaud
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we experimentally investigate the training of neural ODEs for supervised learning. ... We investigate the ability of the latent ODE model to fit and extrapolate time series. ... Table 1: Performance on MNIST. ... Table 2: Predictive RMSE on test set |
| Researcher Affiliation | Academia | Ricky T. Q. Chen*, Yulia Rubanova*, Jesse Bettencourt*, David Duvenaud University of Toronto, Vector Institute |
| Pseudocode | Yes | Algorithm 1 Reverse-mode derivative of an ODE initial value problem (a sketch of this adjoint computation appears below the table) |
| Open Source Code | No | We have since released a PyTorch (Paszke et al., 2017) implementation, including GPU-based implementations of several standard ODE solvers at . |
| Open Datasets | Yes | Table 1: Performance on MNIST. From LeCun et al. (1998). |
| Dataset Splits | No | For the MNIST dataset, no specific train/validation/test split percentages or counts are provided. For the spiral dataset, the paper states "randomly sample points from each trajectory without replacement (n = {30, 50, 100})" and "100 time points extending beyond those that were used for training" (implying a train/test division), but a separate validation split is not explicitly mentioned. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software like LSODE, VODE, Python's Autograd framework, TensorFlow, and PyTorch, but it does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | The recognition network is an RNN with 25 hidden units. We use a 4-dimensional latent space. We parameterize the dynamics function f with a one-hidden-layer network with 20 hidden units. The decoder computing p(x_ti | z_ti) is another neural network with one hidden layer with 20 hidden units. ... For this task, we minimize KL(q(x) ‖ p(x)) as the loss function... and train for 10,000 iterations using Adam (Kingma and Ba, 2014). In contrast, the NF is trained for 500,000 iterations using RMSprop (Hinton et al., 2012). (A sketch of this latent ODE setup appears below the table.) |
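The "Pseudocode" row refers to Algorithm 1 of the paper, the reverse-mode derivative of an ODE initial value problem. Below is a minimal PyTorch sketch of that adjoint computation under simplifying assumptions: a fixed-step RK4 integrator stands in for the paper's adaptive solver, the dynamics function takes its flat parameter vector explicitly, and the names `rk4_step`, `odeint`, and `adjoint_grads` are illustrative, not the released torchdiffeq API.

```python
import torch

def rk4_step(f, t, y, h):
    # One classical Runge-Kutta step of size h for dy/dt = f(t, y).
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def odeint(f, y0, t0, t1, steps=100):
    # Fixed-step integration from t0 to t1 (h is negative when t1 < t0,
    # so the same routine also runs the backwards adjoint pass).
    h = (t1 - t0) / steps
    y = y0
    for i in range(steps):
        y = rk4_step(f, t0 + i * h, y, h)
    return y

def adjoint_grads(f, theta, z1, dLdz1, t0, t1, steps=100):
    # Algorithm 1: integrate the augmented state [z, a, g] backwards from
    # t1 to t0, where a(t) = dL/dz(t) is the adjoint and g accumulates
    # dL/dtheta, starting from [z(t1), dL/dz(t1), 0].
    n, p = z1.numel(), theta.numel()

    def aug_dynamics(t, s):
        z, a = s[:n], s[n:2 * n]
        with torch.enable_grad():
            z = z.detach().requires_grad_(True)
            fz = f(t, z, theta)
            # Vector-Jacobian products a^T df/dz and a^T df/dtheta in one
            # reverse-mode pass; this avoids materializing full Jacobians.
            dz, dtheta = torch.autograd.grad(fz, (z, theta), grad_outputs=a)
        return torch.cat([fz.detach(), -dz, -dtheta])

    s1 = torch.cat([z1, dLdz1, torch.zeros(p)])
    s0 = odeint(aug_dynamics, s1, t1, t0, steps)  # note: backwards in time
    return s0[n:2 * n], s0[2 * n:]                # dL/dz(t0), dL/dtheta

# Toy usage with assumed tanh dynamics f(z) = tanh(W z):
n = 4
theta = torch.randn(n * n, requires_grad=True)
f = lambda t, z, th: torch.tanh(th.view(n, n) @ z)
z1 = odeint(lambda t, z: f(t, z, theta), torch.randn(n), 0.0, 1.0).detach()
dLdz1 = torch.ones(n)  # stand-in for the loss gradient at time t1
dLdz0, dLdtheta = adjoint_grads(f, theta, z1, dLdz1, 0.0, 1.0)
```

The point of the construction is that gradients are obtained by a second ODE solve rather than by backpropagating through the solver's internal steps, which is what gives the method its constant memory cost.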
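For the "Experiment Setup" row, here is a minimal PyTorch sketch of the quoted latent ODE architecture. The layer sizes follow the quoted text (an RNN recognition network with 25 hidden units, a 4-dimensional latent space, and 20-unit one-hidden-layer networks for the dynamics and decoder); `obs_dim`, the sequence length, the Euler integrator, and the step size `dt` are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

obs_dim = 2                                  # assumed size of each observation
rnn_hidden, latent_dim, nhidden = 25, 4, 20  # sizes quoted in the paper

class Recognition(nn.Module):
    # Recognition network: an RNN whose final hidden state parameterizes the
    # approximate posterior over the initial latent state as a diagonal Gaussian.
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(obs_dim, rnn_hidden, batch_first=True)
        self.out = nn.Linear(rnn_hidden, 2 * latent_dim)

    def forward(self, x):                         # x: (batch, time, obs_dim)
        _, h = self.rnn(x)
        return self.out(h[-1]).chunk(2, dim=-1)   # mean, log-variance

# Latent dynamics f (dz/dt = f(z)): one hidden layer with 20 units.
dynamics = nn.Sequential(
    nn.Linear(latent_dim, nhidden), nn.Tanh(), nn.Linear(nhidden, latent_dim))

# Decoder p(x_ti | z_ti): another one-hidden-layer network with 20 units.
decoder = nn.Sequential(
    nn.Linear(latent_dim, nhidden), nn.Tanh(), nn.Linear(nhidden, obs_dim))

# Usage: encode a series, sample z_t0 via the reparameterization trick,
# integrate the latent ODE (simple Euler steps stand in for the paper's
# solver), and decode at the final time point.
x = torch.randn(8, 50, obs_dim)
mu, logvar = Recognition()(x)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
dt = 0.1
for _ in range(50):
    z = z + dt * dynamics(z)   # Euler step through the latent space
x_hat = decoder(z)             # reconstruction at the final time point
```

Per the quoted setup, such a model would be trained end to end for 10,000 iterations with Adam; the training loop itself is omitted here.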