Discovering Neural Wirings
Authors: Mitchell Wortsman, Ali Farhadi, Mohammad Rastegari
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that our learned connectivity outperforms hand engineered and randomly wired networks. By learning the connectivity of MobileNetV1 [12] we boost the ImageNet accuracy by 10% at 41M FLOPs. Moreover, we show that our method generalizes to recurrent and continuous time networks. Our work may also be regarded as unifying core aspects of the neural architecture search problem with sparse neural network learning. |
| Researcher Affiliation | Collaboration | Mitchell Wortsman¹,², Ali Farhadi¹,²,³, Mohammad Rastegari¹,³ (¹PRIOR @ Allen Institute for AI, ²University of Washington, ³XNOR.AI) |
| Pseudocode | Yes | Algorithm 1 DNW-Train(V, V₀, V_E, g_φ, h_θ, {f_v}_{v∈V}, p_xy, k, L) (see the illustrative sketch below the table) |
| Open Source Code | Yes | Code and pretrained models are available at https://github.com/allenai/dnw while additional visualizations may be found at https://mitchellnw.github.io/blog/2019/dnw/. |
| Open Datasets | Yes | Table 1: Testing a tiny (41k parameters) classifier on CIFAR-10 [16] in static and dynamic settings shown as mean and standard deviation (std) over 5 runs. [...] For large scale experiments on ImageNet [5] we are limited to exploring DNW in the static case... |
| Dataset Splits | Yes | Table 1: Testing a tiny (41k parameters) classifier on CIFAR-10 [16] in static and dynamic settings shown as mean and standard deviation (std) over 5 runs. [...] For large scale experiments on ImageNet [5] we are limited to exploring DNW in the static case... |
| Hardware Specification | No | The paper mentions 'Computations on beaker.org were supported in part by credits from Google Cloud.' but does not specify any exact GPU/CPU models, processor types, or memory details used for the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' in its references but does not state specific software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'CUDA 11.1') for the experimental setup. |
| Experiment Setup | Yes | In each experiment we train for 250 epochs using Cosine Annealing as the learning rate scheduler with initial learning rate 0.1, as in [35]. [...] We train for 160 epochs using Cosine Annealing as the learning rate scheduler with initial learning rate = 0.1 unless otherwise noted. (see the scheduler sketch below the table) |
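
The Pseudocode row quotes Algorithm 1 (DNW-Train), whose core step keeps only the top-k edges by weight magnitude in the forward pass while letting gradients reach every edge, so currently unused wirings can re-enter the graph. Below is a minimal PyTorch-style sketch of that straight-through edge selection; the class name `ChooseTopK`, the toy 8x8 weight matrix, and k = 12 are illustrative assumptions, not the authors' implementation.

```python
import torch


class ChooseTopK(torch.autograd.Function):
    """Keep the k highest-magnitude edge weights in the forward pass;
    pass gradients through to all weights in the backward pass."""

    @staticmethod
    def forward(ctx, weights, k):
        # The magnitude of the k-th largest edge sets the inclusion threshold.
        threshold = weights.abs().flatten().topk(k).values.min()
        mask = (weights.abs() >= threshold).to(weights.dtype)
        return weights * mask  # only the chosen edges carry signal forward

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: every edge, including currently unused ones,
        # receives a gradient, so the wiring can change during training.
        return grad_output, None


# Toy usage: an 8x8 weight matrix viewed as the edges of a small graph.
w = torch.randn(8, 8, requires_grad=True)
effective_w = ChooseTopK.apply(w, 12)  # only 12 edges are active in the forward pass
loss = effective_w.sum()
loss.backward()
print(int(w.grad.count_nonzero()))  # 64: gradients reach every edge, not just the 12 used
```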
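
The Experiment Setup row quotes cosine-annealed training from an initial learning rate of 0.1 (250 epochs in one quoted setting, 160 in the other). A minimal sketch of such a schedule with PyTorch's built-in `CosineAnnealingLR` follows; the placeholder model, momentum, and weight decay values are assumptions not stated in the quoted text.

```python
import torch

# Placeholder model and optimizer; momentum and weight decay are illustrative
# assumptions, since the quoted text only specifies the schedule and initial lr.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

epochs = 250  # 160 in the other quoted setting
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... training batches for this epoch would run here ...
    optimizer.step()   # placeholder step so the example is self-contained
    scheduler.step()   # anneal the learning rate along a cosine curve once per epoch
```

Setting `T_max` to the total number of epochs and stepping the scheduler once per epoch matches the quoted description of annealing the learning rate over the whole run.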