Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?

Authors: Hoang Pham, The Anh Ta, Shiwei Liu, Lichuan Xiang, Dung Le, Hongkai Wen, Long Tran-Thanh

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "By conducting extensive experiments across different architectures and datasets, our results demonstrate that our approach outperforms state-of-the-art PaI methods while it is able to discover subnetworks that have much lower inference FLOPs (up to 3.4×)."
Researcher Affiliation | Collaboration | "Hoang Pham1, The-Anh Ta2, Shiwei Liu3,6, Lichuan Xiang4, Dung D. Le5, Hongkai Wen4, Long Tran-Thanh4. 1 FPT Software AI Center, 2 CSIRO's Data61, 3 University of Texas at Austin, 4 University of Warwick, 5 VinUniversity, 6 Eindhoven University of Technology"
Pseudocode | Yes | "We describe our method in Algorithm 1 and the pseudocode for the optimizer in Appendix C."
Open Source Code | Yes | "Code is available at: https://github.com/pvh1602/NPB."
Open Datasets | Yes | "We conduct experiments on three standard datasets: CIFAR-10, CIFAR-100, and Tiny-ImageNet."
Dataset Splits | No | The paper uses the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets but does not explicitly report training/validation/test splits by percentage or count.
Hardware Specification | Yes | "We use Pytorch 2 library and conduct experiments on a single GTX 3090Ti or A100 (depend on their available)."
Software Dependencies | Yes | "We use Pytorch 2 library and conduct experiments on a single GTX 3090Ti or A100 (depend on their available). ... We use the default mixed integer programming solver in CVXPY library"
Experiment Setup | Yes | "Table 2: Summary of the architectures, datasets, and hyperparameters used in experiments."

Network | Dataset | Epochs | Batch | Optimizer | Momentum | LR | LR Drop, Epoch | Weight Decay
VGG-19 | CIFAR-100 | 160 | 128 | SGD | 0.9 | 0.1 | 10x, [60, 120] | 0.0001
ResNet-20 | CIFAR-10 | 160 | 128 | SGD | 0.9 | 0.1 | 10x, [60, 120] | 0.0001
ResNet-18 | Tiny-ImageNet | 100 | 128 | SGD | 0.9 | 0.01 | 10x, [30, 60, 80] | 0.0001
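In the hyperparameter table, an "LR Drop" of "10x, [60,120]" means the learning rate is divided by 10 at each listed epoch, which matches the behavior of PyTorch's MultiStepLR scheduler. A minimal plain-Python sketch of that step schedule, assuming each drop takes effect at its milestone epoch:

```python
def stepped_lr(base_lr, milestones, epoch, factor=0.1):
    """Learning rate after applying `factor` once per milestone reached.

    Mirrors MultiStepLR semantics: at each epoch in `milestones`,
    the learning rate is multiplied by `factor` (0.1 = a "10x drop").
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * (factor ** drops)

# VGG-19 / CIFAR-100 row: base LR 0.1, 10x drops at epochs 60 and 120.
print(stepped_lr(0.1, [60, 120], 0))    # 0.1 before the first drop
print(stepped_lr(0.1, [60, 120], 60))   # 0.01 after the first drop
print(stepped_lr(0.1, [60, 120], 130))  # 0.001 after the second drop
```

The ResNet-18 / Tiny-ImageNet row follows the same rule with base LR 0.01 and milestones [30, 60, 80].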
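The Software Dependencies row notes that the authors rely on CVXPY's default mixed-integer programming solver. As a toy illustration of what a mixed-integer solve involves (sketched here with SciPy's `milp` rather than CVXPY, on a made-up problem that is not the paper's formulation):

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy MILP (illustrative only, not the paper's formulation):
# maximize x0 + 2*x1  subject to  x0 + x1 <= 3,  x >= 0,  x integer.
# milp() minimizes, so the objective coefficients are negated.
c = np.array([-1.0, -2.0])
constraint = LinearConstraint(np.array([[1.0, 1.0]]), ub=3.0)
res = milp(c,
           constraints=[constraint],
           bounds=Bounds(lb=0.0),         # x >= 0
           integrality=np.ones_like(c))   # 1 marks each variable as integer
print(res.x, -res.fun)                    # optimal point and objective value
```

In CVXPY the same structure would be expressed with `cp.Variable(integer=True)` and `Problem.solve()`, which dispatches to whichever mixed-integer solver is installed.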