Improving Neural ODE Training with Temporal Adaptive Batch Normalization
Authors: Su Zheng, Zhengqi Gao, Fan-Keng Sun, Duane Boning, Bei Yu, Martin D. Wong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments on image classification and physical system modeling substantiate the superiority of TA-BN compared to baseline methods. |
| Researcher Affiliation | Academia | Su Zheng (1), Zhengqi Gao (2), Fan-Keng Sun (2), Duane S. Boning (2), Bei Yu (1), Martin Wong (1); (1) Department of CSE, CUHK; (2) Department of EECS, MIT |
| Pseudocode | Yes | Algorithm 1: The forward pass of a TA-BN layer at time t_j (a hedged code sketch follows the table). |
| Open Source Code | Yes | We put part of the code for reproducibility in the supplementary material. It will be released upon acceptance. |
| Open Datasets | Yes | We conduct image classification across datasets including MNIST [26], SVHN [33], CIFAR-10, CIFAR-100 [22], and Tiny-ImageNet [24]. |
| Dataset Splits | No | The paper explicitly states a 90% training / 10% testing split for the Charge Pump circuit modeling dataset, but does not describe a validation split for any of the datasets used. |
| Hardware Specification | Yes | All experiments are run on a Linux server with RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions software such as PyTorch and torchdiffeq but does not specify their version numbers. |
| Experiment Setup | Yes | We employ the dopri5 solver with a tolerance of 10^-3 for ODE solving and adopt the AdamW optimizer [27] with a learning rate of 10^-3 to train the neural networks for 128 epochs. The training batch size is 256. We set M = 100 for TA-BN. (A configuration sketch follows the table.) |
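
The Pseudocode row references Algorithm 1, the forward pass of a TA-BN layer at time t_j. The paper's exact update rule is not reproduced in this report; the sketch below only assumes that TA-BN keeps M independent BatchNorm modules over the integration interval and routes the input at time t_j to the module whose time slot contains it. The class name `TABatchNorm2d`, the interval `[0, t_max]`, and the bucketing rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class TABatchNorm2d(nn.Module):
    """Illustrative sketch of a temporally adaptive BN layer.

    Assumption: TA-BN maintains M independent BatchNorm2d modules covering the
    integration interval [0, t_max] and normalizes the input at time t_j with
    the module whose time slot contains t_j. The exact mapping and statistics
    update are defined by Algorithm 1 in the paper; the bucketing rule below
    is only a stand-in.
    """

    def __init__(self, num_features: int, M: int = 100, t_max: float = 1.0):
        super().__init__()
        self.M = M
        self.t_max = t_max
        self.bns = nn.ModuleList(nn.BatchNorm2d(num_features) for _ in range(M))

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Map the continuous time t to one of the M slots and normalize with
        # that slot's own running statistics and affine parameters.
        idx = int(torch.clamp(t / self.t_max * self.M, 0, self.M - 1))
        return self.bns[idx](x)
```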
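
The Experiment Setup row can be read as a training configuration. Below is a minimal, hedged sketch built from the reported hyper-parameters (dopri5 with tolerance 10^-3, AdamW with learning rate 10^-3, 128 epochs, batch size 256, M = 100). The ODE function layout, the classifier head, the placeholder random batch, and the use of the reported tolerance for both `rtol` and `atol` are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # torchdiffeq is named in the Software Dependencies row

# Hyper-parameters reported in the Experiment Setup row.
LR, EPOCHS, BATCH_SIZE, TOL = 1e-3, 128, 256, 1e-3


class ODEFunc(nn.Module):
    """Hypothetical ODE dynamics: conv + TA-BN + ReLU (layout is an assumption)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm = TABatchNorm2d(channels, M=100)  # sketch defined above

    def forward(self, t, x):
        return torch.relu(self.norm(self.conv(x), t))


func = ODEFunc(channels=16)
head = nn.Linear(16, 10)  # placeholder classifier head
optimizer = torch.optim.AdamW(list(func.parameters()) + list(head.parameters()), lr=LR)
criterion = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    # Placeholder batch; a real run would iterate a dataset loader with batch size 256.
    x0 = torch.randn(BATCH_SIZE, 16, 8, 8)
    y = torch.randint(0, 10, (BATCH_SIZE,))

    optimizer.zero_grad()
    # Integrate the dynamics from t=0 to t=1 with dopri5; applying TOL to both
    # rtol and atol is an assumption about how "tolerance" is meant.
    xT = odeint(func, x0, torch.tensor([0.0, 1.0]),
                method="dopri5", rtol=TOL, atol=TOL)[-1]
    logits = head(xT.mean(dim=(2, 3)))  # global average pooling, then linear head
    loss = criterion(logits, y)
    loss.backward()
    optimizer.step()
```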