Winning the Lottery with Continuous Sparsification
Authors: Pedro Savarese, Hugo Silva, Michael Maire
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that we surpass the state-of-the-art for both objectives, across models and datasets, including VGG trained on CIFAR-10 and ResNet-50 trained on ImageNet. |
| Researcher Affiliation | Academia | Pedro Savarese (TTI-Chicago, savarese@ttic.edu); Hugo Silva (University of Alberta, hugoluis@ualberta.ca); Michael Maire (University of Chicago, mmaire@uchicago.edu) |
| Pseudocode | Yes | Algorithm 1 (Iterative Magnitude Pruning [19]): Input: pruning ratio τ, number of rounds R, iterations per round T, rewind point k... Algorithm 2 (Continuous Sparsification): Input: mask init s(0), penalty λ, number of rounds R, iterations per round T, rewind point k (illustrative sketch below the table) |
| Open Source Code | Yes | Code available at https://github.com/lolemacs/continuous-sparsification |
| Open Datasets | Yes | VGG trained on CIFAR-10 [23] and ResNet-50 trained on ImageNet [24]. |
| Dataset Splits | No | The paper describes training and testing procedures but does not explicitly mention or detail the use of a validation set for model tuning or selection. |
| Hardware Specification | No | The paper mentions using '4 GPUs' for experiments but does not specify the model or type of these GPUs, or any other specific hardware details. |
| Software Dependencies | No | The paper mentions using SGD as an optimizer but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers that were used for implementation. |
| Experiment Setup | Yes | In each round, we train with SGD, a learning rate of 0.1, and a momentum of 0.9, for a total of 85 epochs, using a batch size of 64 for VGG and 128 for ResNet. We decay the learning rate by a factor of 10 at epochs 56 and 71, and utilize a weight decay of 0.0001. |
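
The Pseudocode row lists the inputs to Algorithm 2 (mask init s(0), penalty λ, rounds R, iterations per round T, rewind point k) but not how the mask enters the forward pass. The sketch below illustrates the soft-gating idea as described in the paper: each weight is multiplied by σ(β·s), β is annealed within a round, and a penalty on the soft mask is added to the training loss. Class and method names (`SoftMaskedConv2d`, `soft_mask`, `l1_penalty`) are illustrative assumptions, not taken from the authors' repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftMaskedConv2d(nn.Module):
    """Conv layer whose weights are gated by a continuous mask sigma(beta * s).

    Illustrative sketch only: s is a learnable per-weight score, beta is a
    temperature annealed externally during each round, and the paper's lambda
    penalty is applied to the soft mask to encourage sparsity.
    """

    def __init__(self, in_channels, out_channels, kernel_size, mask_init=0.0):
        super().__init__()
        self.weight = nn.Parameter(
            torch.empty(out_channels, in_channels, kernel_size, kernel_size))
        nn.init.kaiming_normal_(self.weight)
        # s(0): mask initialization, broadcast to the weight shape here
        self.s = nn.Parameter(torch.full_like(self.weight, mask_init))
        self.beta = 1.0  # temperature; increased over the course of a round

    def soft_mask(self):
        return torch.sigmoid(self.beta * self.s)

    def forward(self, x):
        # Effective weights are the element-wise product of mask and weight
        return F.conv2d(x, self.soft_mask() * self.weight,
                        padding=self.weight.shape[-1] // 2)

    def l1_penalty(self):
        # Multiplied by lambda and added to the loss during training
        return self.soft_mask().sum()

    def binary_mask(self):
        # After a round, the mask is binarized: entries with s > 0 survive
        return (self.s > 0).float()
```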
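The paper does not name a deep-learning framework (see the Software Dependencies row), so the following PyTorch snippet is only a sketch of the reported optimization schedule: SGD with learning rate 0.1, momentum 0.9, weight decay 0.0001, and the learning rate divided by 10 at epochs 56 and 71 over 85 total epochs (batch size 64 for VGG on CIFAR-10, 128 for ResNet-50 on ImageNet). The helper name `make_optimizer_and_scheduler` and per-epoch scheduler stepping are assumptions.

```python
import torch

def make_optimizer_and_scheduler(model):
    """Builds the reported optimization setup: SGD, lr 0.1, momentum 0.9,
    weight decay 1e-4, with the learning rate decayed by a factor of 10
    at epochs 56 and 71 over 85 training epochs."""
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[56, 71], gamma=0.1)
    return optimizer, scheduler  # call scheduler.step() once per epoch
```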