Hypersolvers: Toward Fast Continuous-Depth Models
Authors: Michael Poli, Stefano Massaroli, Atsushi Yamashita, Hajime Asama, Jinkyoo Park
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations on standard benchmarks, such as sampling for continuous normalizing flows, reveal consistent Pareto efficiency over classical numerical methods. |
| Researcher Affiliation | Academia | Michael Poli (KAIST, DiffEqML) poli_m@kaist.ac.kr; Stefano Massaroli (The University of Tokyo, DiffEqML) massaroli@robot.t.u-tokyo.ac.jp; Atsushi Yamashita (The University of Tokyo) yamashita@robot.t.u-tokyo.ac.jp; Hajime Asama (The University of Tokyo) asama@robot.t.u-tokyo.ac.jp; Jinkyoo Park (KAIST) jinkyoo.park@kaist.ac.kr |
| Pseudocode | No | The paper provides mathematical formulations and equations but does not include structured pseudocode or an algorithm block. |
| Open Source Code | Yes | Supporting reproducibility code is at https://github.com/DiffEqML/diffeqml-research/tree/master/hypersolver |
| Open Datasets | Yes | We train standard convolutional Neural ODEs with input layer augmentation (Massaroli et al., 2020b) on MNIST and CIFAR10 datasets. |
| Dataset Splits | No | The paper mentions using 'training dataset' and 'test data' but does not explicitly specify a validation set or detailed split percentages (e.g., 80/10/10) needed for reproduction. |
| Hardware Specification | Yes | The measurements presented are collected on a single V100 GPU. |
| Software Dependencies | No | The paper mentions 'TorchDyn (Poli et al., 2020) library' and 'PyTorch (Paszke et al., 2017) module implementation' but does not specify their version numbers. |
| Experiment Setup | Yes | Following this initial optimization step, 2-layer convolutional Euler hypersolvers, HyperEuler, (4) are trained by residual fitting (6) on 10 epochs of the training dataset with solution mesh length set to K = 10. As ground truth labels, we utilize the solutions obtained via dopri5 with absolute and relative tolerances set to $10^{-4}$ on the same data. |
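
The residual-fitting setup in the last row can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scalar toy vector field `dz/ds = A * z` in place of a convolutional Neural ODE, a fine-step RK4 solve in place of dopri5 for the ground-truth labels, and a one-parameter linear correction `g(z) = w * z` fitted by closed-form least squares in place of a 2-layer convolutional network trained for 10 epochs. All names (`f`, `rk4_solve`, `reference_mesh`, `g`, `solve`) are hypothetical. The update is written under the assumption that the HyperEuler step has the form `z_{k+1} = z_k + eps * f(z_k) + eps^2 * g(z_k)`, with `g` fitted to the scaled local residual of the Euler step.

```python
import math

A = -1.0          # decay rate of the toy vector field (assumption)
K = 10            # solution mesh length, matching K = 10 in the setup
EPS = 1.0 / K     # step size on the depth interval [0, 1]


def f(z):
    """Toy vector field dz/ds = A * z, standing in for a trained Neural ODE."""
    return A * z


def rk4_solve(z, span, n_steps=100):
    """Fine-step RK4 solve over `span`; a stand-in for dopri5 ground truth."""
    h = span / n_steps
    for _ in range(n_steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z = z + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return z


def reference_mesh(z0):
    """Reference solution evaluated at the K + 1 mesh points."""
    mesh = [z0]
    for _ in range(K):
        mesh.append(rk4_solve(mesh[-1], EPS))
    return mesh


# Residual targets: R_k = (z_{k+1} - z_k - EPS * f(z_k)) / EPS**2
states, residuals = [], []
for z0 in (0.5, 1.0, 1.5, 2.0):
    mesh = reference_mesh(z0)
    for k in range(K):
        states.append(mesh[k])
        residuals.append((mesh[k + 1] - mesh[k] - EPS * f(mesh[k])) / EPS**2)

# Fit g(z) = w * z by closed-form least squares (replaces gradient training).
w = sum(z * r for z, r in zip(states, residuals)) / sum(z * z for z in states)


def g(z):
    """Learned correction to the Euler step."""
    return w * z


def solve(z, use_hypersolver):
    """Integrate over [0, 1] in K steps with plain Euler or the corrected step."""
    for _ in range(K):
        correction = EPS**2 * g(z) if use_hypersolver else 0.0
        z = z + EPS * f(z) + correction
    return z


exact = math.exp(A)
euler_err = abs(solve(1.0, use_hypersolver=False) - exact)
hyper_err = abs(solve(1.0, use_hypersolver=True) - exact)
print(f"Euler error: {euler_err:.2e}, HyperEuler error: {hyper_err:.2e}")
```

For this linear toy problem the Euler residual is itself linear in `z`, so a single weight suffices; a nonlinear vector field would need a more expressive `g`, such as the convolutional network described in the paper.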