Pruning’s Effect on Generalization Through the Lens of Training and Regularization
Authors: Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that both factors are essential to fully explaining pruning's impact on generalization. We use standard architectures: LeNet [32], VGG-16 [56], ResNet-20, ResNet-32 and ResNet-50 [24], and train on benchmarks (MNIST, CIFAR-10, CIFAR-100, ImageNet) |
| Researcher Affiliation | Collaboration | Tian Jin1 Michael Carbin1 Daniel M. Roy2 Jonathan Frankle3 Gintare Karolina Dziugaite4 1MIT 2University of Toronto, Vector Institute 3MosaicML 4Google Research, Brain Team |
| Pseudocode | No | The paper describes the iterative magnitude pruning (IMP) algorithm and its components in prose, but it does not include a formally structured pseudocode or algorithm block. A hedged sketch of IMP appears after the table. |
| Open Source Code | No | The checklist question 3(a) 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?' is answered with '[No]'. |
| Open Datasets | Yes | We use standard architectures: LeNet [32], VGG-16 [56], ResNet-20, ResNet-32 and ResNet-50 [24], and train on benchmarks (MNIST, CIFAR-10, CIFAR-100, ImageNet) using standard hyperparameter settings and standard cross-entropy loss function [13, 14, 66]. |
| Dataset Splits | No | The paper mentions 'best validation error' and an 'optimally sparse model', implying the use of a validation set, but it does not provide specific details on the training, validation, and test data splits (e.g., percentages or sample counts). |
| Hardware Specification | Yes | We use PyTorch [50] on TPUs with OpenLTH library [13]. |
| Software Dependencies | No | The paper mentions 'PyTorch' and 'OpenLTH library' as software used, but it does not provide specific version numbers for these components. |
| Experiment Setup | Yes | Following Frankle and Carbin [13], Frankle et al. [14], we set the t in IMP to t = 0 for MNIST-LeNet benchmark and t = 10 for the others. Appendix B shows further details. |
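
Since the paper presents iterative magnitude pruning only in prose, the following is a minimal Python/PyTorch sketch of IMP with weight rewinding to step t, in the style of Frankle et al. [14]. It is not the authors' code: the `train_fn` callback, its `start_step`/`stop_step` parameters, and the per-round pruning fraction are hypothetical placeholders.

```python
import copy
import torch

def imp_with_rewinding(model, train_fn, rounds=10, prune_frac=0.2, rewind_step=10):
    """Sketch of iterative magnitude pruning (IMP) with weight rewinding.

    `train_fn(model, start_step, stop_step)` is a hypothetical stand-in for
    the paper's training loop; it trains `model` between the given steps
    and returns it. `rewind_step` corresponds to the paper's t.
    """
    # Train briefly to step t and snapshot the weights to rewind to.
    model = train_fn(model, start_step=0, stop_step=rewind_step)
    rewind_state = copy.deepcopy(model.state_dict())

    # Start with a dense mask; prune only weight matrices/conv kernels.
    masks = {name: torch.ones_like(p)
             for name, p in model.named_parameters() if p.dim() > 1}

    for _ in range(rounds):
        # Train the masked model to completion (stop_step=None: full run).
        model = train_fn(model, start_step=rewind_step, stop_step=None)

        # Globally rank surviving weights by magnitude and find the
        # threshold below which the smallest `prune_frac` fall.
        scores = torch.cat([
            (p.detach().abs() * masks[name]).flatten()
            for name, p in model.named_parameters() if name in masks
        ])
        surviving = scores[scores > 0]
        k = max(1, int(prune_frac * surviving.numel()))
        threshold = surviving.kthvalue(k).values

        # Zero out mask entries for the newly pruned weights.
        for name, p in model.named_parameters():
            if name in masks:
                masks[name] *= (p.detach().abs() > threshold).float()

        # Rewind surviving weights to their values at step t, re-apply mask.
        model.load_state_dict(rewind_state)
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name in masks:
                    p.mul_(masks[name])

    return model, masks
```

Two simplifications to note: the sketch ranks weights globally across layers, whereas layer-wise pruning is an equally common variant, and it zeroes pruned weights only at the start of each round; a full implementation would re-apply the mask after every optimizer step (or via hooks) so pruned weights stay at zero during training.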