Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks
Authors: Maxwell M Aladago, Lorenzo Torresani
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the efficacy of our algorithm through experiments on MNIST and CIFAR-10. On MNIST, our randomly weighted Lenet-300-100 (Lecun et al., 1998) obtains a 97.0% test set accuracy when using K = 2 options per connection and 98.2% with K = 8. On CIFAR10 (Krizhevsky, 2009), our six layer convolutional network outperforms the traditionally-trained network when selecting from K = 8 fixed random values at each connection. |
| Researcher Affiliation | Academia | 1 Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA. Correspondence to: Maxwell Mbabilla Aladago <maxwell.m.aladago.gr@dartmouth.edu>. |
| Pseudocode | No | The paper describes the algorithm mathematically and with a diagram (Figure 1), but it does not include a formal pseudocode block or a section explicitly labeled 'Algorithm'. (A hedged code sketch of the selection mechanism is provided after this table.) |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | On MNIST, we experiment with the Lenet-300-100 (Lecun et al., 1998) architecture... On CIFAR10 (Krizhevsky, 2009)... |
| Dataset Splits | Yes | We use 15% and 10% of the training sets of MNIST and CIFAR-10, respectively, for validation. We report performance on the separate test set. (A sketch of one way to construct such splits follows the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'PyTorch (Paszke et al., 2019)' but does not specify any version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | All models use a batch size of 128 and stochastic gradient descent with warm restarts (Loshchilov & Hutter, 2017), a momentum of 0.9 and an ℓ2 penalty of 0.00014. When training GS slot machines, we set the learning rate to 0.2 for K ≤ 8 and 0.1 otherwise. We set the learning rate to 0.01 when directly optimizing the weights (training from scratch and finetuning) except when training VGG-19, where we set the learning rate to 0.1. (A hedged PyTorch sketch of this optimizer configuration follows the table.) |
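To make the selection mechanism referenced in the "Research Type" and "Pseudocode" rows concrete, the following is a minimal PyTorch sketch of a linear layer that chooses one of K fixed random values per connection through learnable quality scores with greedy selection (GS). The class name `SlotLinear`, the scaling of the random options, the softmax over scores, and the straight-through gradient trick are illustrative assumptions; the paper's exact parameterization and update rule may differ, and this is not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotLinear(nn.Module):
    """Linear layer whose weights are selected from K fixed random options
    per connection via learnable quality scores (greedy selection, GS).
    Sketch only; the init scaling and gradient estimator are assumptions."""

    def __init__(self, in_features, out_features, k=8):
        super().__init__()
        # K fixed random candidate values per connection; never updated.
        self.register_buffer(
            "options",
            torch.randn(out_features, in_features, k) / in_features ** 0.5,
        )
        # Learnable quality scores, one per candidate value.
        self.scores = nn.Parameter(torch.rand(out_features, in_features, k))
        # Bias kept as an ordinary trainable parameter here (an assumption).
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        probs = F.softmax(self.scores, dim=-1)
        # Greedy selection: pick the option with the highest score.
        idx = probs.argmax(dim=-1, keepdim=True)
        hard = torch.zeros_like(probs).scatter_(-1, idx, 1.0)
        # Straight-through trick: forward uses the hard one-hot choice,
        # backward flows gradients to the scores through the soft probabilities.
        sel = hard + probs - probs.detach()
        weight = (self.options * sel).sum(dim=-1)
        return F.linear(x, weight, self.bias)
```

For example, a Lenet-300-100-style MLP could be assembled by replacing each `nn.Linear` with `SlotLinear` (e.g. `SlotLinear(784, 300, k=8)`); only the scores and biases are trained while the candidate weights stay fixed.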
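The "Dataset Splits" row reports that 15% of the MNIST and 10% of the CIFAR-10 training sets are held out for validation. Below is a small sketch of one way to build such splits with torchvision; the helper name `train_val_split`, the fixed seed, and the use of `random_split` are assumptions, since the quoted text does not say how the validation subsets were drawn.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

def train_val_split(dataset, val_fraction, seed=0):
    # Hold out a fraction of the training set for validation (sketch;
    # the paper does not specify the split procedure or seed).
    n_val = int(len(dataset) * val_fraction)
    n_train = len(dataset) - n_val
    return random_split(dataset, [n_train, n_val],
                        generator=torch.Generator().manual_seed(seed))

mnist_train = datasets.MNIST("data", train=True, download=True,
                             transform=transforms.ToTensor())
mnist_tr, mnist_val = train_val_split(mnist_train, 0.15)   # 15% validation

cifar_train = datasets.CIFAR10("data", train=True, download=True,
                               transform=transforms.ToTensor())
cifar_tr, cifar_val = train_val_split(cifar_train, 0.10)   # 10% validation
```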
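The "Experiment Setup" row translates fairly directly into a PyTorch optimizer and scheduler. The sketch below encodes the quoted batch size, learning rates, momentum, and ℓ2 penalty; the helper name `make_optimizer`, the `mode`/`arch` arguments, and the warm-restart period `T_0` are illustrative assumptions, as the quoted passage does not give the restart schedule.

```python
import torch

def make_optimizer(params, mode="gs_slot_machine", k=8, arch="lenet"):
    # Learning rates follow the quoted setup; 'mode' and 'arch' are
    # illustrative switches, not names used in the paper.
    if mode == "gs_slot_machine":
        lr = 0.2 if k <= 8 else 0.1            # GS slot machines
    else:
        lr = 0.1 if arch == "vgg19" else 0.01  # training from scratch / finetuning
    opt = torch.optim.SGD(params, lr=lr, momentum=0.9, weight_decay=0.00014)
    # SGD with warm restarts (Loshchilov & Hutter, 2017); T_0 is an assumption.
    sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=10)
    return opt, sched

# All models use a batch size of 128, e.g.:
# loader = torch.utils.data.DataLoader(mnist_tr, batch_size=128, shuffle=True)
```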