Pruning Deep Neural Networks from a Sparsity Perspective
Authors: Enmao Diao, Ganghua Wang, Jiawei Zhang, Yuhong Yang, Jie Ding, Vahid Tarokh
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments corroborate the hypothesis that, for a generic pruning procedure, PQI first decreases while a large model is being effectively regularized, then increases when its compressibility reaches a limit that appears to correspond to the beginning of underfitting. Subsequently, PQI decreases again when the model collapses and significant deterioration in its performance begins to occur. Additionally, our experiments demonstrate that the proposed adaptive pruning algorithm, with a proper choice of hyper-parameters, is superior to iterative pruning algorithms such as lottery-ticket-based pruning methods in terms of both compression efficiency and robustness. Our code is available here. (Section 4, Experimental Studies) |
| Researcher Affiliation | Academia | Enmao Diao, Department of Electrical and Computer Engineering, Duke University, Durham, NC 27705, USA (enmao.diao@duke.edu); Ganghua Wang, School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA (wang9019@umn.edu); Jiawei Zhang, School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA (zhan4362@umn.edu); Yuhong Yang, School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA (yangx374@umn.edu); Jie Ding, School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA (dingj@umn.edu); Vahid Tarokh, Department of Electrical and Computer Engineering, Duke University, Durham, NC 27705, USA (vahid.tarokh@duke.edu) |
| Pseudocode | Yes | Algorithm 1 Sparsity-informed Adaptive Pruning (SAP) |
| Open Source Code | Yes | Our code is available here. |
| Open Datasets | Yes | We conduct experiments with Fashion MNIST (Xiao et al., 2017), CIFAR10, CIFAR100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015) datasets. |
| Dataset Splits | No | The paper mentions using a "validation dataset" as a common stopping criterion and using the CIFAR10/100, Fashion MNIST, and Tiny ImageNet datasets, but does not provide specific details on the train/validation/test splits used for these experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names, framework versions) used in the experiments. |
| Experiment Setup | Yes | Details of the model architecture and learning hyper-parameters are included in the Appendix. We run experiments for T = 30 pruning iterations with Linear, MLP, and CNN models, and T = 15 pruning iterations with ResNet18. We set P = 0.2 throughout our experiments. In these experiments, r and γ are set to 0 and 1 to prevent interference. |
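To make the quoted setup concrete, below is a minimal, hedged sketch of the two ingredients the table refers to: a PQ-Index-style sparsity measure (the paper's PQI, here written as I_{p,q}(w) = 1 − d^{1/q − 1/p} · ‖w‖_p / ‖w‖_q for 0 < p < q; the default p = 0.5, q = 1.0 and the epsilon guard are our assumptions) and a generic one-shot magnitude-pruning step with the pruning ratio P = 0.2 from the table. This is illustrative only — it is not the authors' Sparsity-informed Adaptive Pruning (Algorithm 1), which adapts the pruning ratio using PQI rather than fixing it.

```python
import numpy as np


def pq_index(w, p=0.5, q=1.0, eps=1e-12):
    """PQ-Index-style sparsity measure (sketch, p and q defaults assumed).

    Returns ~0 for a uniform-magnitude vector and approaches 1 for a
    one-hot vector, so larger values indicate a sparser weight vector.
    """
    v = np.abs(np.ravel(np.asarray(w, dtype=float))) + eps
    d = v.size
    norm_p = np.sum(v ** p) ** (1.0 / p)
    norm_q = np.sum(v ** q) ** (1.0 / q)
    return 1.0 - d ** (1.0 / q - 1.0 / p) * norm_p / norm_q


def prune_by_magnitude(w, ratio=0.2):
    """Zero out the smallest-magnitude `ratio` fraction of surviving weights.

    Generic magnitude pruning, NOT the paper's adaptive SAP rule.
    """
    pruned = np.array(w, dtype=float, copy=True)
    alive = np.abs(pruned[pruned != 0.0])
    k = int(np.floor(ratio * alive.size))
    if k == 0:
        return pruned
    thresh = np.sort(alive)[k - 1]
    pruned[np.abs(pruned) <= thresh] = 0.0
    return pruned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=1000)
    # Track PQI across T = 30 pruning iterations, as in the table above.
    for t in range(30):
        w = prune_by_magnitude(w, ratio=0.2)
        print(f"iter {t + 1:2d}: nonzero = {np.count_nonzero(w):4d}, "
              f"PQI = {pq_index(w):.3f}")
```

Running the loop shows PQI rising as repeated pruning makes the surviving weight vector sparser; in the paper's hypothesis, the non-monotone PQI trajectory (decrease, increase, decrease) is what signals regularization, the onset of underfitting, and eventual model collapse.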