Go with the flow: Adaptive control for Neural ODEs

Authors: Mathieu Chalvidal, Matthew Ricci, Rufin VanRullen, Thomas Serre

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide theoretical and empirical evidence that N-CODE circumvents limitations of previous NODE models and show how increased model expressivity manifests in several supervised and unsupervised learning problems. These favorable empirical results indicate the potential of using data- and activity-dependent plasticity in neural networks across numerous domains."
Researcher Affiliation | Academia | Mathieu Chalvidal (1,2,3), Matthew Ricci (4), Rufin VanRullen (1,3), Thomas Serre (1,2). (1) Artificial and Natural Intelligence Toulouse Institute, Université de Toulouse, France; (2) Carney Institute for Brain Science, Dpt. of Cognitive, Linguistic & Psychological Sciences, Brown University, Providence, RI 02912; (3) Centre de Recherche Cerveau & Cognition, CNRS, Université de Toulouse; (4) Data Science Initiative, Brown University, Providence, RI 02912. Emails: {mathieu_chalvid, mgr, thomas_serre}@brown.edu, rufin.vanrullen@cnrs.fr
Pseudocode | Yes | "We provide here a commented generic PyTorch implementation for the N-CODE module in the open-loop setting." class NCODE_func(torch.nn.Module): ... class NCODE_block(torch.nn.Module): ... (a hedged sketch of such a module is given after this table).
Open Source Code | No | The paper provides code snippets within the appendix as examples but does not provide a link to a full public code repository or explicitly state that the source code for the methodology is being released.
Open Datasets | Yes | "We perform this experiment on the MNIST and CIFAR-10 image datasets." "For MNIST and CIFAR-10 classification..." "latent space dimension of 25 for MNIST, 128 for CIFAR-10 and 64 for CelebA. All convolutions and transposed convolutions have a filter size of 4×4 for MNIST and CIFAR-10 and 5×5 for CelebA."
Dataset Splits | Yes | "We perform this experiment on the MNIST and CIFAR-10 image datasets." "Our model consists of the dynamical system..." "Table 1: Classification accuracy on MNIST and CIFAR-10 test sets..." "Official train and test splits are used for the three datasets."
Hardware Specification | Yes | "Our experiments were run on a 12GB NVIDIA Titan Xp GPU cluster equipped with the CUDA 10.1 driver." "We acknowledge the Cloud TPU hardware resources that Google made available via the TensorFlow Research Cloud (TFRC) program..."
Software Dependencies | No | "Neural ODEs were trained using the torchdiffeq (Chen et al., 2018) PyTorch package." "In practice, we compute the Jacobian for this augmented dynamics with open source automatic differentiation libraries using PyTorch (Paszke et al., 2019), enabling seamless integration of N-CODE modules in bigger architectures." The CUDA 10.1 driver is mentioned, but specific software library versions (e.g., the PyTorch version) are not. (A sketch of such an autodiff Jacobian computation is given after this table.)
Experiment Setup | Yes | "Our model consists of the dynamical system with equation of motion f expressed as a single convolutional block (1×1 → 3×3 → 1×1) with 50 channels..." "For all models we used the adaptive Dormand-Prince solver with a tolerance of 1e-3. For each model, we learn the model weights with gradient descent using an Adam optimizer with a learning rate λ = 3e-4..." "latent space dimension of 25 for MNIST, 128 for CIFAR-10 and 64 for CelebA. All convolutions and transposed convolutions have a filter size of 4×4 for MNIST and CIFAR-10 and 5×5 for CelebA. We apply batch normalization to all layers..." "For training, we use a mini-batch size of 64 for MNIST and CIFAR and 16 for CelebA..." "All models are trained for a maximum of 50 epochs on MNIST and CIFAR and 40 epochs on CelebA. Gradient descent is performed for 50 epochs with the Adam optimizer (Kingma & Ba, 2014) with learning rate λ = 1e-3, reduced by half every time the loss plateaus." (See the training-setup sketch after this table.)
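
The appendix classes quoted in the Pseudocode row are elided above, so the following is only a minimal, hedged reconstruction of what an open-loop N-CODE block could look like: NCODE_func couples the hidden state z with a time-varying weight vector theta driven by a controller g, and NCODE_block maps the input to an initial control theta(0) and integrates the joint system with torchdiffeq. The controller parameterization, dimensions, and solver settings here are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
from torchdiffeq import odeint


class NCODE_func(nn.Module):
    # Joint dynamics of the hidden state z and its flattened control weights theta.
    # Open-loop setting: d(theta)/dt depends on theta (and time) only, not on z.
    def __init__(self, dim):
        super().__init__()
        self.dim = dim
        # Controller g producing d(theta)/dt (assumed form).
        self.g = nn.Sequential(nn.Linear(dim * dim, dim * dim), nn.Tanh())

    def forward(self, t, state):
        z, theta = state                                   # z: (batch, dim), theta: (batch, dim*dim)
        W = theta.view(-1, self.dim, self.dim)             # current weight matrix theta(t)
        dz = torch.tanh(torch.bmm(W, z.unsqueeze(-1))).squeeze(-1)  # f(z, theta(t))
        dtheta = self.g(theta)                             # open-loop weight dynamics
        return dz, dtheta


class NCODE_block(nn.Module):
    # Maps the input to an initial control theta(0) (data-dependent plasticity)
    # and integrates the coupled (z, theta) system with torchdiffeq.
    def __init__(self, dim, t1=1.0):
        super().__init__()
        self.func = NCODE_func(dim)
        self.to_theta0 = nn.Linear(dim, dim * dim)
        self.register_buffer("t", torch.tensor([0.0, t1]))

    def forward(self, z0):
        theta0 = self.to_theta0(z0)
        z_traj, _ = odeint(self.func, (z0, theta0), self.t,
                           method="dopri5", rtol=1e-3, atol=1e-3)
        return z_traj[-1]                                  # hidden state at final time

# Example usage: NCODE_block(dim=25)(torch.randn(16, 25)) returns a (16, 25) tensor.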
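
The Software Dependencies row quotes the paper's statement that the Jacobian of the augmented dynamics is obtained with PyTorch's automatic differentiation. As a rough illustration only, the snippet below shows one way such a Jacobian can be computed with torch.autograd.functional.jacobian; the toy dynamics function is a placeholder, not the paper's augmented system.

import torch

def augmented_dynamics(t, state):
    # Placeholder dynamics standing in for the paper's augmented system (assumption).
    return torch.tanh(state) * torch.cos(t)

t0 = torch.tensor(0.0)
state0 = torch.randn(8)

# Jacobian of the vector field with respect to the state, via PyTorch autodiff.
J = torch.autograd.functional.jacobian(lambda s: augmented_dynamics(t0, s), state0)
print(J.shape)  # torch.Size([8, 8])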
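
The hyperparameters quoted in the Experiment Setup row translate into a fairly standard PyTorch training loop. The sketch below is a schematic reconstruction only: the model is a trivial placeholder for the paper's N-CODE classifier and the data pipeline is assumed, while the Adam learning rates, batch size, epoch budget, and Dormand-Prince tolerance are taken from the quoted text.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder model (assumption); the paper uses an N-CODE convolutional block instead.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)    # mini-batch size 64

optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)            # λ = 3e-4 (classification)
criterion = nn.CrossEntropyLoss()

# For the generative experiments the paper reports λ = 1e-3, halved when the loss
# plateaus, e.g.:
# scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5)

for epoch in range(50):                                              # up to 50 epochs on MNIST/CIFAR
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# Inside the ODE block itself, the Dormand-Prince solver with tolerance 1e-3
# corresponds to torchdiffeq's odeint(..., method="dopri5", rtol=1e-3, atol=1e-3).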