Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity

Authors: Artem Vysogorets, Julia Kempe

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this study, we (i) formulate and illustrate the importance of effective sparsity by reevaluating several recent pruning strategies; (ii) provide algorithms to prune according to and compute effective sparsity; (iii) reconfirm that networks pruned at initialization are robust to layerwise reshuffling of surviving edges (Frankle et al., 2021) in the new sparsity framework; and (iv) design efficient layerwise sparsity quotas (IGQ) for random pruning that perform consistently well across all sparsity regimes. Our experiments encompass modern architectures on commonly used computer vision benchmark datasets: LeNet-300-100 (LeCun et al., 1998) on MNIST, LeNet-5 (LeCun et al., 1998) on CIFAR-10, VGG-19 (Simonyan and Zisserman, 2015) on CIFAR-100, ResNet-18 (He et al., 2016) on Tiny ImageNet, and ResNet-50 and MobileNetV2 (Howard et al., 2017) on ImageNet. We place results of VGG-16 (Simonyan and Zisserman, 2015) on CIFAR-10 in Appendix B, as they closely resemble those of VGG-19. Further experimental details are listed in Appendix A.
Researcher Affiliation | Academia | Artem Vysogorets (EMAIL), Center for Data Science, New York University, New York, NY 10011, USA; Julia Kempe (EMAIL), Center for Data Science and Courant Institute of Mathematical Sciences, New York University, New York, NY 10011, USA
Pseudocode | Yes | Algorithm 1: Approximate Effective Random Pruning
Open Source Code | No | "We use our own implementation of all pruning algorithms in TensorFlow except for GraSP, for which we use the original code in PyTorch published by Wang et al. (2020)." The paper describes its own implementation but provides no link or explicit statement that the code was released.
Open Datasets | Yes | Our experimental work encompasses seven different architecture-dataset combinations: LeNet-300-100 (LeCun et al., 1998) on MNIST (Creative Commons Attribution-ShareAlike 3.0 license), LeNet-5 (LeCun et al., 1998) and VGG-16 (Simonyan and Zisserman, 2015) on CIFAR-10 (MIT license), VGG-19 (Simonyan and Zisserman, 2015) on CIFAR-100 (MIT license), ResNet-18 (He et al., 2016) on Tiny ImageNet (MIT license), and ResNet-50 and MobileNetV2 (Howard et al., 2017) on ImageNet-2012 (Deng et al., 2009).
Dataset Splits | No | The paper uses well-known datasets (MNIST, CIFAR-10, CIFAR-100, Tiny ImageNet, and ImageNet-2012) and discusses data augmentation, but it does not explicitly specify how these datasets were split into training, validation, and test sets (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification | Yes | Training was performed on an internal cluster equipped with NVIDIA RTX-8000, NVIDIA V100, and AMD MI50 GPUs.
Software Dependencies | No | "We use our own implementation of all pruning algorithms in TensorFlow except for GraSP, for which we use the original code in PyTorch published by Wang et al. (2020)." The paper names TensorFlow and PyTorch but does not specify version numbers for these software components.
Experiment Setup | Yes | Table 1: Summary of experimental work. All architectures include batch normalization layers followed by ReLU activations. Models are initialized with the Kaiming normal scheme (fan-avg) and optimized by SGD (momentum 0.9) with a stepwise LR schedule (a factor-of-10 drop applied at the specified drop epochs). The categorical cross-entropy loss is used for all models. Table 1 additionally lists the specific values of Epochs, Drop epochs, Batch, LR, and Decay for each model.
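As context for the effective-sparsity framing quoted under Research Type, here is a minimal sketch of how such a quantity can be computed for a chain of binary pruning masks in a bias-free fully connected network. This is a hypothetical illustration, not the paper's Algorithm 1: a surviving weight is counted as effective only if its input neuron is reachable from the network input and its output neuron can reach the network output.

```python
import numpy as np

def effective_sparsity(masks):
    """Fraction of all weights that are *effectively* pruned: removed by
    the mask or disconnected from the input or the output.

    masks: list of 0/1 arrays; masks[l] has shape (fan_in_l, fan_out_l).
    """
    # Forward pass: neurons that receive signal from the input layer.
    reach_in = [np.ones(masks[0].shape[0], dtype=int)]
    for m in masks:
        reach_in.append((m.T @ reach_in[-1] > 0).astype(int))
    # Backward pass: neurons from which the output layer is reachable.
    reach_out = [np.ones(masks[-1].shape[1], dtype=int)]
    for m in reversed(masks):
        reach_out.append((m @ reach_out[-1] > 0).astype(int))
    reach_out = reach_out[::-1]
    # Weight (i, j) of layer l is effective iff it survives the mask,
    # neuron i is input-reachable, and neuron j is output-reaching.
    total = sum(m.size for m in masks)
    effective = sum(
        (m * np.outer(reach_in[l], reach_out[l + 1])).sum()
        for l, m in enumerate(masks)
    )
    return 1.0 - effective / total
```

For example, a two-layer mask chain whose surviving edges form no input-to-output path has an effective sparsity of 1.0 even though its direct sparsity is below 1.0, which is the gap the paper's framework measures.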
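The stepwise schedule described under Experiment Setup (learning rate divided by 10 at each drop epoch, feeding SGD with momentum 0.9) can be sketched as follows. The base rate and drop epochs below are placeholders, not values from the paper's Table 1.

```python
def stepwise_lr(base_lr, drop_epochs, drop_factor=10.0):
    """Return a function epoch -> learning rate that divides the base
    rate by `drop_factor` at every epoch listed in `drop_epochs`."""
    def lr_at(epoch):
        drops = sum(1 for e in drop_epochs if epoch >= e)
        return base_lr / (drop_factor ** drops)
    return lr_at

# Placeholder settings, not the paper's Table 1 values.
schedule = stepwise_lr(base_lr=0.1, drop_epochs=[80, 120])
```

The returned `schedule` can be queried each epoch to set the optimizer's learning rate, e.g. via a learning-rate callback or scheduler in TensorFlow or PyTorch.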