Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?
Authors: Hoang Pham, The Anh Ta, Shiwei Liu, Lichuan Xiang, Dung Le, Hongkai Wen, Long Tran-Thanh
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By conducting extensive experiments across different architectures and datasets, our results demonstrate that our approach outperforms state-of-the-art PaI methods while it is able to discover subnetworks that have much lower inference FLOPs (up to 3.4×). |
| Researcher Affiliation | Collaboration | Hoang Pham1, The-Anh Ta2, Shiwei Liu3,6, Lichuan Xiang4, Dung D. Le5, Hongkai Wen4, Long Tran-Thanh4 1 FPT Software AI Center, 2 CSIRO's Data61, 3 University of Texas at Austin, 4 University of Warwick, 5 VinUniversity, 6 Eindhoven University of Technology |
| Pseudocode | Yes | We describe our method in Algorithm 1 and the pseudo code for optimizer in Appendix C. |
| Open Source Code | Yes | Code is available at: https://github.com/pvh1602/NPB. |
| Open Datasets | Yes | We conduct experiments on three standard datasets: CIFAR-10, CIFAR-100, and Tiny-Imagenet. |
| Dataset Splits | No | The paper mentions using CIFAR-10, CIFAR-100, and Tiny-Imagenet datasets, but does not explicitly provide information on training/validation/test splits by percentages or counts. |
| Hardware Specification | Yes | We use the PyTorch library and conduct experiments on a single GTX 3090Ti or A100 (depending on their availability). |
| Software Dependencies | Yes | We use the PyTorch library and conduct experiments on a single GTX 3090Ti or A100 (depending on their availability). ... We use the default mixed integer programming solver in CVXPY library |
| Experiment Setup | Yes | Table 2: Summary of the architectures, datasets, and hyperparameters used in experiments. VGG-19 / CIFAR-100: 160 epochs, batch 128, SGD (momentum 0.9), LR 0.1, 10x drop at epochs [60,120], weight decay 0.0001. ResNet-20 / CIFAR-10: 160 epochs, batch 128, SGD (momentum 0.9), LR 0.1, 10x drop at epochs [60,120], weight decay 0.0001. ResNet-18 / Tiny-ImageNet: 100 epochs, batch 128, SGD (momentum 0.9), LR 0.01, 10x drop at epochs [30,60,80], weight decay 0.0001. |
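The stepped learning-rate schedule in Table 2 (e.g. initial LR 0.1 divided by 10 at epochs 60 and 120 for the CIFAR runs) can be sketched as a small plain-Python helper; the function name and signature are illustrative, not taken from the paper's code.

```python
def lr_at_epoch(epoch, base_lr=0.1, drop_epochs=(60, 120), drop_factor=10.0):
    """Step schedule: divide base_lr by drop_factor at each milestone
    epoch that has been reached (matches the '10x, [60,120]' entries)."""
    drops = sum(1 for milestone in drop_epochs if epoch >= milestone)
    return base_lr / (drop_factor ** drops)

# VGG-19 / CIFAR-100 schedule: LR stays 0.1 until epoch 60,
# then 0.01 until epoch 120, then 0.001 for the remaining epochs.
```

For the ResNet-18 / Tiny-ImageNet run the same helper would be called with `base_lr=0.01` and `drop_epochs=(30, 60, 80)`.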