Neural Parameter Allocation Search
Authors: Bryan A. Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, Kate Saenko
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments include a wide variety of tasks and networks in order to demonstrate the broad applicability of NPAS and SSNs. We benchmark SSNs for LB- and HB-NPAS and show they create high-performing networks when either using few parameters or adding network capacity. |
| Researcher Affiliation | Collaboration | Boston University, ETH Zürich, MIT-IBM Watson AI Lab |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | To further aid reproducibility, we publicly release our SSN code at https://github.com/BryanPlummer/SSN. |
| Open Datasets | Yes | We evaluate SSNs on CIFAR-10 and CIFAR-100 (Krizhevsky, 2009)... and ImageNet (Deng et al., 2009)... We benchmark on Flickr30K (Young et al., 2014)... and COCO (Lin et al., 2014)... We use SQuAD v1.1 (Rajpurkar et al., 2016)... and SQuAD v2.0 (Rajpurkar et al., 2018)... |
| Dataset Splits | Yes | We benchmark on Flickr30K (Young et al., 2014) which contains 30K/1K/1K images for training/testing/validation, and COCO (Lin et al., 2014), which contains 123K/1K/1K images for training/testing/validation. |
| Hardware Specification | Yes | When using 64 V100 GPUs for training WRN-50-2 on ImageNet, we see a 1.04× performance improvement in runtime per epoch when using SSNs with 10.5M parameters (15% of the original model). |
| Software Dependencies | No | The paper mentions 'PyTorch's official implementation' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For each model, we use the authors' implementation and hyperparameters, unless noted (more details in Appendix A). Specifically, on CIFAR we train our model using a batch size of 128 for 200 epochs with weight decay set at 5e-4 and an initial learning rate of 0.1 which we decay using a gamma of 0.2 at 60, 120, and 160 epochs. |
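
The CIFAR hyperparameters quoted in the Experiment Setup row map directly onto a standard PyTorch training configuration. The sketch below only illustrates those quoted values (batch size 128, 200 epochs, weight decay 5e-4, initial learning rate 0.1 decayed by 0.2 at epochs 60, 120, and 160); the choice of SGD with momentum 0.9, the dataset transform, and the model passed in are assumptions for illustration, and the SSN parameter-sharing machinery from the authors' repository is omitted.

```python
# Hedged sketch of the quoted CIFAR training schedule, not the authors' code.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

def build_cifar_training(model: nn.Module):
    """Return a dataloader, optimizer, and LR schedule matching the quoted setup."""
    transform = T.Compose([T.ToTensor()])  # assumption: augmentation details are not quoted
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=transform)
    loader = torch.utils.data.DataLoader(
        train_set, batch_size=128, shuffle=True)   # batch size 128

    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=0.1,                                    # initial learning rate 0.1
        momentum=0.9,                              # assumption: momentum value is not quoted
        weight_decay=5e-4)                         # weight decay 5e-4

    # Decay the learning rate by gamma=0.2 at epochs 60, 120, and 160
    # over a 200-epoch run, as stated in the quoted setup.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[60, 120, 160], gamma=0.2)
    return loader, optimizer, scheduler
```

A training loop would then iterate for 200 epochs, calling `scheduler.step()` once per epoch; anything specific to SSNs or the released code at https://github.com/BryanPlummer/SSN is out of scope for this sketch.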