Neural Pruning via Growing Regularization
Authors: Huan Wang, Can Qin, Yulun Zhang, Yun Fu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTAL RESULTS Datasets and networks. We first conduct analyses on the CIFAR10/100 datasets Krizhevsky (2009) with ResNet56 He et al. (2016)/VGG19 Simonyan & Zisserman (2015). Then we evaluate our methods on the large-scale ImageNet dataset Deng et al. (2009) with ResNet34 and 50 He et al. (2016). For CIFAR datasets, we train our baseline models with accuracies comparable to those in the original papers. For ImageNet, we take the official PyTorch Paszke et al. (2019) pre-trained models as baseline to maintain comparability with other methods. |
| Researcher Affiliation | Academia | Huan Wang, Can Qin, Yulun Zhang, Yun Fu, Northeastern University, Boston, MA, USA {wang.huan, qin.ca}@northeastern.edu, yulun100@gmail.com, yunfu@ece.neu.edu |
| Pseudocode | Yes | Algorithm 1 GReg-1 and GReg-2 Algorithms |
| Open Source Code | Yes | Our code and trained models are publicly available at https://github.com/mingsuntse/regularization-pruning. |
| Open Datasets | Yes | Datasets and networks. We first conduct analyses on the CIFAR10/100 datasets Krizhevsky (2009) with ResNet56 He et al. (2016)/VGG19 Simonyan & Zisserman (2015). Then we evaluate our methods on the large-scale ImageNet dataset Deng et al. (2009) with ResNet34 and 50 He et al. (2016). |
| Dataset Splits | Yes | Datasets and networks. We first conduct analyses on the CIFAR10/100 datasets Krizhevsky (2009) with ResNet56 He et al. (2016)/VGG19 Simonyan & Zisserman (2015). Then we evaluate our methods on the large-scale ImageNet dataset Deng et al. (2009) with ResNet34 and 50 He et al. (2016). For ImageNet, we take the official PyTorch Paszke et al. (2019) pre-trained models as baseline to maintain comparability with other methods. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch Paszke et al. (2019)' but does not provide specific version numbers for PyTorch or any other software dependencies used in the experiments. |
| Experiment Setup | Yes | Training settings. To control irrelevant factors as much as we can, for comparison methods that release their pruning ratios, we adopt their ratios; otherwise, we use our specified ones. We compare the speedup (measured by FLOPs reduction) since we mainly target model acceleration rather than compression. Detailed training settings (e.g., hyper-parameters and layer pruning ratios) are summarized in the Appendix. Table 5: Training setting summary. For the SGD solver, in the parentheses are the momentum and weight decay. For ImageNet, batch size 64 is used for pruning instead of the standard 256, because we want to save training time. Table 8: Hyper-parameters of our methods. |
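The pseudocode row above references the paper's "Algorithm 1 GReg-1 and GReg-2 Algorithms", whose core idea is to grow an L2 penalty on the filters selected for removal so their weights shrink toward zero before they are pruned. The following is a minimal NumPy sketch of that idea, not the authors' implementation; all names, constants, and the L1-norm filter ranking are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of growing-regularization pruning (GReg-1 style):
# filters chosen for pruning receive an L2 penalty whose strength `lam`
# grows periodically, driving their weights toward zero before removal.

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3, 3, 3))            # 8 conv filters, shape (out, in, k, k)

# rank filters by L1 norm; mark the smallest half for pruning (illustrative criterion)
norms = np.abs(W).sum(axis=(1, 2, 3))
prune_idx = np.argsort(norms)[: len(norms) // 2]

lam, delta, lr = 0.0, 1e-3, 1e-2             # growing penalty step and SGD step size
for step in range(500):
    grad = np.zeros_like(W)                  # task-loss gradient omitted in this sketch
    grad[prune_idx] += lam * W[prune_idx]    # L2 penalty applied only to pruned filters
    W -= lr * grad                           # plain SGD update
    if step % 10 == 0:
        lam += delta                         # grow the regularization strength

W[prune_idx] = 0.0                           # finally remove the penalized filters
```

After the loop, the penalized filters have been pushed close to zero and are then hard-pruned, while the kept filters are untouched; in the paper this is followed by fine-tuning with the SGD settings listed in its Table 5.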