Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient

Authors: Ankit Pensia, Shashank Rajput, Alliot Nagle, Harit Vishwakarma, Dimitris Papailiopoulos

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We verify our results empirically by approximating a target network via SUBSETSUM in Experiment 1, and by pruning a sufficiently over-parameterized neural network that implements the structures in Figures 1b and 1c in Experiment 2. In both setups, we benchmark on the MNIST [33] dataset, and all training and pruning is accomplished with cosine annealing learning rate decay [34] on a batch size 64 with momentum 0.9 and weight decay 0.0005.
Researcher Affiliation | Academia | Ankit Pensia, University of Wisconsin-Madison, ankitp@cs.wisc.edu; Shashank Rajput, University of Wisconsin-Madison, rajput3@wisc.edu; Alliot Nagle, University of Wisconsin-Madison, acnagle@wisc.edu; Harit Vishwakarma, University of Wisconsin-Madison, hvishwakarma@cs.wisc.edu; Dimitris Papailiopoulos, University of Wisconsin-Madison, dimitris@papail.io
Pseudocode | No | The paper describes its mathematical proofs and experimental procedures in narrative text, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include any explicit statement or link indicating that the source code for the methodology described in the paper is publicly available.
Open Datasets | Yes | In both setups, we benchmark on the MNIST [33] dataset
Dataset Splits | No | The paper mentions training on MNIST and achieving a "final test set accuracy", but it does not explicitly provide details about training/validation/test dataset splits (e.g., percentages or sample counts for each split).
Hardware Specification | Yes | The 397,000 weights in our target network were approximated with 3,725,871 coefficients in 21.5 hours on 36 cores of a c5.18xlarge AWS EC2 instance.
Software Dependencies | No | The paper mentions using "Gurobi's MIP solver" and cites its reference manual from 2020, but it does not provide a specific version number (e.g., Gurobi X.Y) for the software dependencies used.
Experiment Setup | Yes | ...all training and pruning is accomplished with cosine annealing learning rate decay [34] on a batch size 64 with momentum 0.9 and weight decay 0.0005.
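To make the two quoted procedures concrete, below are two minimal, hedged sketches; neither is the authors' code, and all model choices, learning rates, and epoch counts are placeholder assumptions. The first illustrates one plausible way to pose the SUBSETSUM approximation of a single target weight as a mixed-integer program with Gurobi (the paper reports using Gurobi's MIP solver, but its exact formulation is not given).

```python
# Sketch: approximate one target weight by a subset sum of random candidate weights.
# Assumed MIP formulation, not the authors' code; requires gurobipy and a Gurobi license.
import numpy as np
import gurobipy as gp
from gurobipy import GRB

rng = np.random.default_rng(0)
target = 0.37                               # illustrative target-network weight
candidates = rng.uniform(-1.0, 1.0, 20)     # random weights available to the subset sum

m = gp.Model("subset_sum_approx")
m.Params.OutputFlag = 0
z = m.addVars(len(candidates), vtype=GRB.BINARY, name="z")   # include candidate i or not
err = m.addVar(lb=0.0, name="abs_err")
total = gp.quicksum(float(candidates[i]) * z[i] for i in range(len(candidates)))
m.addConstr(err >= total - target)          # linearize |target - total|
m.addConstr(err >= target - total)
m.setObjective(err, GRB.MINIMIZE)
m.optimize()

chosen = [i for i in range(len(candidates)) if z[i].X > 0.5]
print(f"error {err.X:.2e} using {len(chosen)} of {len(candidates)} candidates")
```

The second sketch shows the quoted training configuration (SGD with momentum 0.9, weight decay 0.0005, batch size 64, cosine annealing learning rate decay) on MNIST; the network architecture, initial learning rate, and epoch count are assumptions for illustration only.

```python
# Sketch of the quoted training setup; hyperparameters not quoted above are placeholders.
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import CosineAnnealingLR
from torchvision import datasets, transforms

train_set = datasets.MNIST("./data", train=True, download=True,
                           transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,      # initial LR is a placeholder
                            momentum=0.9, weight_decay=0.0005)
epochs = 10                                                   # placeholder epoch count
scheduler = CosineAnnealingLR(optimizer, T_max=epochs)
criterion = nn.CrossEntropyLoss()

for _ in range(epochs):
    for x, y in train_loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
    scheduler.step()                                          # cosine annealing per epoch
```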