NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations
Authors: Marco Ciccone, Marco Gallieri, Jonathan Masci, Christian Osendorfer, Faustino Gomez
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets. [...] These implementations are compared experimentally with ResNets on both CIFAR-10 and CIFAR-100 datasets, in section 5, showing that NAIS-Nets achieve comparable classification accuracy with a much better generalization gap. |
| Researcher Affiliation | Collaboration | Marco Ciccone (Politecnico di Milano, NNAISENSE SA) marco.ciccone@polimi.it; Marco Gallieri (NNAISENSE SA) marco@nnaisense.com; Jonathan Masci (NNAISENSE SA) jonathan@nnaisense.com; Christian Osendorfer (NNAISENSE SA) christian@nnaisense.com; Faustino Gomez (NNAISENSE SA) tino@nnaisense.com |
| Pseudocode | Yes | Algorithm 1 (Fully Connected Reprojection) and Algorithm 2 (CNN Reprojection); see the reprojection sketch after this table. |
| Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the described methodology. |
| Open Datasets | Yes | Experiments were conducted comparing NAIS-Net with ResNet, and variants thereof, using both fully-connected (MNIST, section 5.1) and convolutional (CIFAR-10/100, section 5.2) architectures... [27] A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. 2009. [32] Y. LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998. |
| Dataset Splits | Yes | Experiments were conducted comparing NAIS-Net with ResNet, and variants thereof, using both fully-connected (MNIST, section 5.1) and convolutional (CIFAR-10/100, section 5.2) architectures... For the MNIST dataset [32] a single-block NAIS-Net was compared... These benchmarks are simple enough to allow for multiple runs to test for statistical significance, yet sufficiently complex to require convolutional layers. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'tensorflow/models' in a footnote, but it does not specify version numbers for TensorFlow or any other software libraries or dependencies used in their implementation. |
| Experiment Setup | Yes | All networks were trained using stochastic gradient descent with momentum 0.9 and learning rate 0.1, for 150 epochs. [...] The initial learning rate of 0.1 was decreased by a factor of 10 at epochs 150, 250 and 350 and the experiments were run for 450 epochs. (See the training-schedule sketch after this table.) |
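
The pseudocode row above refers to the paper's stability-enforcing reprojection of the state-transition matrix. Below is a minimal NumPy sketch of the fully connected case only, assuming the parameterization A = -RᵀR - εI and a Frobenius-norm rescaling of R with a generic margin `delta`; the exact bound used in the paper's Algorithm 1, and the convolutional variant of Algorithm 2, are not reproduced here.

```python
import numpy as np

def fc_reprojection(R, delta=0.1):
    """Sketch of a Frobenius-norm reprojection for the fully connected case.

    Rescales R so that ||R^T R||_F stays below 1 - delta; the exact margin
    used in the paper's Algorithm 1 may differ (assumption).
    """
    norm = np.linalg.norm(R.T @ R, ord="fro")
    if norm > 1.0 - delta:
        R = R * np.sqrt((1.0 - delta) / norm)
    return R

def nais_net_block(x, u, R, B, b, h=1.0, eps=0.01, n_layers=5):
    """Unrolls one NAIS-Net block: x_{k+1} = x_k + h * tanh(A x_k + B u + b),
    with A = -R^T R - eps*I (fully connected parameterization, as assumed above)."""
    R = fc_reprojection(R)
    A = -R.T @ R - eps * np.eye(R.shape[1])
    for _ in range(n_layers):
        x = x + h * np.tanh(A @ x + B @ u + b)
    return x
```

A usage example under these assumptions: with `R`, `B` of shape (n, n), state `x` and input `u` of shape (n,), calling `nais_net_block(x, u, R, B, b)` iterates the block with a shared, reprojected transition matrix across its layers.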
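The experiment-setup row quotes SGD with momentum 0.9 and an initial learning rate of 0.1 cut by a factor of 10 at epochs 150, 250 and 350 over 450 epochs. The following is a minimal TensorFlow/Keras sketch of that optimizer and schedule only; the `steps_per_epoch` value, model, and dataset pipeline are hypothetical placeholders, and the paper does not state which framework version was used.

```python
import tensorflow as tf

# Hypothetical number of optimizer steps per epoch (depends on dataset size and batch size);
# 391 corresponds to CIFAR-10's 50,000 training images with batch size 128.
steps_per_epoch = 391

# Learning rate 0.1, dropped by a factor of 10 at epochs 150, 250 and 350,
# matching the schedule quoted in the experiment-setup row.
boundaries = [150 * steps_per_epoch, 250 * steps_per_epoch, 350 * steps_per_epoch]
values = [0.1, 0.01, 0.001, 0.0001]
lr_schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(boundaries, values)

# SGD with momentum 0.9, as reported in the paper.
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)

# `model` and `train_dataset` are placeholders for a NAIS-Net/ResNet classifier
# and a CIFAR input pipeline, neither of which is specified here.
# model.compile(optimizer=optimizer, loss="sparse_categorical_crossentropy",
#               metrics=["accuracy"])
# model.fit(train_dataset, epochs=450)
```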