LyaNet: A Lyapunov Framework for Training Neural ODEs
Authors: Ivan Dario Jimenez Rodriguez, Aaron Ames, Yisong Yue
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Relative to standard Neural ODE training, we empirically find that LyaNet can offer improved prediction performance, faster convergence of inference dynamics, and improved adversarial robustness. |
| Researcher Affiliation | Collaboration | 1Department of Computational and Mathematical Sciences, California Institute of Technology 2Argo AI. |
| Pseudocode | Yes | Algorithm 1: Monte Carlo LyaNet Training; Algorithm 2: Path Integral LyaNet Training (a hedged sketch of the Monte Carlo variant follows the table). |
| Open Source Code | Yes | Our code is available at https://github.com/ivandariojr/LyapunovLearning. |
| Open Datasets | Yes | We evaluate primarily on three computer vision datasets: FashionMNIST, CIFAR-10, and CIFAR-100. |
| Dataset Splits | Yes | We found this by performing a grid search on learning rates and batch sizes over (0.1, 0.01, 0.001) × (32, 64, 128), validated on a held-out set of 10% of the training data (see the grid-search sketch after the table). |
| Hardware Specification | Yes | Our experiments ran on a cluster with 6 GPUs: 4 GeForce 1080 GPUs, 1 Titan X, and 1 Titan RTX. All experiments were able to run on less than 10GB of VRAM. |
| Software Dependencies | No | The paper mentions 'Nero (Liu et al., 2021)' and 'PGD as implemented by Kim (2020)' but does not specify version numbers for these software components or libraries. |
| Experiment Setup | Yes | To simplify tuning, we trained our models using Nero (Liu et al., 2021) with a learning rate of 0.01, with a batch size of 64 for models trained with LyaNet and 128 for models trained with regular backpropagation. We found this by performing a grid search on learning rates and batch sizes over (0.1, 0.01, 0.001) × (32, 64, 128), validated on a held-out set of 10% of the training data. All models were trained for a total of 120 epochs. For our adversarial attack we used PGD as implemented by Kim (2020) for 10 iterations with a step size α = 2/255 (see the PGD sketch after the table). |
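The paper's Algorithm 1 (Monte Carlo LyaNet Training) descends a Lyapunov loss that penalizes violations of an exponential-stability condition on the inference ODE, rather than backpropagating through an ODE solver. Below is a minimal PyTorch sketch of that idea under stated assumptions: the dynamics signature `f_theta(x, eta, t)`, the cross-entropy potential `V`, the rate `kappa`, and Gaussian state sampling are illustrative choices, not details taken from the excerpts above.

```python
import torch
import torch.nn.functional as F

def lyapunov_violation(f_theta, x, eta, t, y, kappa=1.0):
    """Hinge on the exponential-stability condition dV/dt <= -kappa * V.
    V(eta) is taken here to be cross-entropy between the ODE state
    (read as logits) and the label y; kappa=1.0 is an assumed rate."""
    eta = eta.detach().requires_grad_(True)
    V = F.cross_entropy(eta, y, reduction="none")           # V(eta) >= 0, shape (B,)
    (grad_V,) = torch.autograd.grad(V.sum(), eta, create_graph=True)
    eta_dot = f_theta(x, eta, t)                            # inference dynamics
    V_dot = (grad_V * eta_dot).sum(dim=-1)                  # <grad V, f_theta>
    return F.relu(V_dot + kappa * V).mean()                 # condition violation

def monte_carlo_step(f_theta, optimizer, x, y, num_classes):
    """One Monte Carlo training step: sample t ~ U[0, 1] and a state eta
    (Gaussian here for simplicity; the paper samples states from a
    compact set), then descend the expected Lyapunov-condition violation."""
    optimizer.zero_grad()
    t = torch.rand(x.shape[0], 1, device=x.device)
    eta = torch.randn(x.shape[0], num_classes, device=x.device)
    loss = lyapunov_violation(f_theta, x, eta, t, y)
    loss.backward()                                         # gradients flow via eta_dot
    optimizer.step()
    return loss.item()
```

Because the loss is evaluated at sampled (state, time) pairs, no ODE solve is needed during training; this is the mechanism behind the faster-inference claims quoted in the table.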
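The hyperparameter selection quoted in the table (a grid over learning rates and batch sizes, validated on a held-out 10% of the training set) can be sketched as follows. `train_and_eval` is a hypothetical helper standing in for a full training run, and selecting by validation accuracy is an assumption.

```python
from itertools import product
from torch.utils.data import random_split

# Hold out 10% of the training data for validation, as described above.
n_val = len(train_set) // 10
train_subset, val_subset = random_split(train_set, [len(train_set) - n_val, n_val])

best = None
for lr, bs in product([0.1, 0.01, 0.001], [32, 64, 128]):
    # train_and_eval is hypothetical: trains a model with these settings
    # and returns validation accuracy on val_subset.
    acc = train_and_eval(train_subset, val_subset, lr=lr, batch_size=bs)
    if best is None or acc > best[0]:
        best = (acc, lr, bs)
```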
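"PGD as implemented by Kim (2020)" refers to the torchattacks library. A minimal usage sketch with the quoted settings (10 iterations, step size 2/255) is below; the perturbation budget `eps=8/255` is an assumed value not stated in the excerpt.

```python
import torchattacks

# 10-step PGD matching the excerpt: steps=10, alpha=2/255.
# eps=8/255 is an assumed L-infinity budget, not given in the excerpt.
attack = torchattacks.PGD(model, eps=8/255, alpha=2/255, steps=10)
adv_images = attack(images, labels)  # perturbed inputs for the robustness evaluation
```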