Dynamic Structure Pruning for Compressing CNNs
Authors: Jun-Hyung Park, Yeachan Kim, Junho Kim, Joon-Young Choi, SangKeun Lee
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that dynamic structure pruning achieves state-of-the-art pruning performance and better realistic acceleration on a GPU compared with channel pruning. In particular, it reduces the FLOPs of ResNet50 by 71.85% without accuracy degradation on the ImageNet dataset. Our code is available at https://github.com/irishev/DSP. |
| Researcher Affiliation | Academia | Jun-Hyung Park1, Yeachan Kim2, Junho Kim2, Joon-Young Choi2, SangKeun Lee1,2 1Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea 2Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea {irish07, yeachan, monocrat, johnjames, yalphy}@korea.ac.kr |
| Pseudocode | Yes | Algorithm 1: Dynamic Structure Pruning |
| Open Source Code | Yes | Our code is available at https://github.com/irishev/DSP. |
| Open Datasets | Yes | We validate the effectiveness of dynamic structure pruning through extensive experiments with diverse network architectures on the CIFAR-10 (Krizhevsky, Hinton et al. 2009) and ImageNet (Deng et al. 2009) datasets. |
| Dataset Splits | Yes | We report test/validation accuracy of pruned models (P. Acc.) for CIFAR-10/ImageNet, accuracy difference between the original and pruned models (Δ Acc.), and pruning rates of parameters (Params ↓) and FLOPs (FLOPs ↓). |
| Hardware Specification | Yes | The experiments are implemented using PyTorch and conducted on a Linux machine with an Intel i9-10980XE CPU and 4 NVIDIA RTX A5000 GPUs. |
| Software Dependencies | No | The paper mentions implementing experiments using PyTorch, but does not provide specific version numbers for PyTorch or any other software libraries or dependencies. |
| Experiment Setup | Yes | We search the hyperparameters for dynamic structure pruning based on the empirical analysis, i.e., the value of τ ∈ {0.125, 0.25, 0.5, 1}, λ ∈ {5e-4, 1e-3, 2e-3, 3e-3} for CIFAR-10 and λ ∈ {1e-4, 2e-4, 3e-4, 5e-4} for ImageNet. We use the Adam optimizer with a learning rate of 0.001 and momentum of (0.9, 0.999) to train group parameters. During differentiable group learning, we set the initial learning rate to 0.05, and train models for 120 and 60 epochs in the CIFAR-10 and ImageNet experiments, respectively. Then, pruned models are fine-tuned for 80 epochs with initial learning rates of 0.015 and 0.05 for five and three iterations in the CIFAR-10 and ImageNet experiments, respectively. We use cosine learning rate scheduling with weight decay of 1e-3 and 3e-5 for the CIFAR-10 and ImageNet experiments, respectively, to yield the best results fitted to our additional regularization. (See the configuration sketch below the table.) |
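
The Experiment Setup row above maps onto a fairly standard PyTorch optimizer/scheduler configuration. Below is a minimal sketch assembled only from the hyperparameters quoted there; the `resnet50` placeholder model, the per-layer group parameters, and the use of SGD for the network weights are illustrative assumptions rather than the authors' implementation, which is available at https://github.com/irishev/DSP.

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingLR
from torchvision.models import resnet50

# Placeholder network; the authors' pruning code is at https://github.com/irishev/DSP.
model = resnet50()

# Hypothetical group parameters: one learnable score vector per conv layer,
# standing in for the differentiable group assignments described in the paper.
group_params = [
    nn.Parameter(torch.zeros(m.out_channels))
    for m in model.modules()
    if isinstance(m, nn.Conv2d)
]

# Group parameters are trained with Adam, lr=0.001, momentum (betas)=(0.9, 0.999),
# as quoted in the Experiment Setup row.
group_optimizer = optim.Adam(group_params, lr=1e-3, betas=(0.9, 0.999))

# Model weights during differentiable group learning, using the quoted ImageNet
# numbers: initial lr 0.05, 60 epochs, weight decay 3e-5. SGD with momentum 0.9
# is an assumption; the excerpt does not name the weight optimizer.
weight_optimizer = optim.SGD(model.parameters(), lr=0.05, momentum=0.9, weight_decay=3e-5)

epochs = 60  # ImageNet budget; the excerpt uses 120 epochs for CIFAR-10
scheduler = CosineAnnealingLR(weight_optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... one pass over the training set, stepping both optimizers ...
    scheduler.step()

# Pruned models are then fine-tuned for 80 epochs per iteration with an initial
# lr of 0.05 (ImageNet) or 0.015 (CIFAR-10), repeated for 3 or 5 iterations.
```

The sketch only reproduces the optimizer and schedule settings quoted in the table; the pruning-specific steps (group assignment, regularization with λ and τ, and the iterative prune/fine-tune loop) are documented in the paper and its released code.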