Prior Gradient Mask Guided Pruning-Aware Fine-Tuning
Authors: Linhang Cai, Zhulin An, Chuanguang Yang, Yangchun Yan, Yongjun Xu
AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on three image classification benchmarks CIFAR-10/100 and ILSVRC-2012 demonstrate the effectiveness of our method for various CNN architectures, datasets and pruning rates. Notably, on ILSVRC-2012, PGMPF reduces 53.5% FLOPs on ResNet-50 with only 0.90% top-1 accuracy drop and 0.52% top-5 accuracy drop, which has advanced the state-of-the-art with negligible extra computational cost. |
| Researcher Affiliation | Collaboration | Linhang Cai1,2, Zhulin An1*, Chuanguang Yang1,2, Yangchun Yan3, Yongjun Xu1 1Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China 2University of Chinese Academy of Sciences, Beijing, China 3Horizon Robotics, Beijing, China Email: {cailinhang19g, anzhulin, yangchuanguang, xyj}@ict.ac.cn, yangchun.yan@horizon.ai |
| Pseudocode | Yes | Algorithm 1: PGMPF Algorithm (a hedged sketch follows the table) |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We empirically evaluate our PGMPF for VGGNet and ResNet (Simonyan and Zisserman 2015; He et al. 2016) on three datasets: CIFAR-10/100 and ILSVRC-2012 (Krizhevsky 2009; Russakovsky et al. 2015). |
| Dataset Splits | Yes | Both CIFAR-10 and CIFAR-100 consist of 50,000 training images and 10,000 test images of size 32×32 pixels, drawn from 10 classes and 100 classes respectively. ILSVRC-2012 contains 1.28 million training images and 50k validation images divided into 1,000 classes. (A loading sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types). |
| Software Dependencies | No | The paper mentions the use of deep learning frameworks and models like VGGNet and ResNet, but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | On CIFAR-10/100, we follow the parameter scheme and the training configuration in GHFP and CPMC. On ILSVRC-2012, we follow the parameter setting and the data augmentation scheme in ASFP. The total numbers of pruning and fine-tuning epochs on CIFAR-10/100 and ILSVRC-2012 are 200 and 100 respectively, following the settings of ASFP, ASRFP and GHFP (He et al. 2019a; Cai et al. 2021b,a). Models are either pruned from scratch or pruned from pretrained models. For pruning pre-trained models, we set the initial learning rate as one-tenth of the original learning rate. Algorithm 1 also details parameters such as the initial decay rate α0, the final pruning rate Pl, β(0) = 1, and p = 0.5. (A schedule sketch follows the table.) |
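
The Pseudocode row notes that the paper provides Algorithm 1 for PGMPF, but the table does not reproduce it. Below is a minimal PyTorch sketch of the core idea as described in the abstract: a prior mask derived from a pruning criterion guides gradients during fine-tuning so that pruned filters are not updated. The function names, the L2 ranking criterion, and the `keep_ratio` argument are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

def prior_gradient_mask(conv: nn.Conv2d, keep_ratio: float) -> torch.Tensor:
    """Rank output filters and keep the top fraction.

    The L2-norm criterion here is an assumption standing in for the
    paper's pruning criterion. Returns a (out_channels,) mask where
    1 = keep and 0 = pruned.
    """
    norms = conv.weight.detach().flatten(1).norm(p=2, dim=1)
    n_keep = max(1, int(keep_ratio * norms.numel()))
    mask = torch.zeros_like(norms)
    mask[norms.topk(n_keep).indices] = 1.0
    return mask

def apply_gradient_masks(model: nn.Module, masks: dict) -> None:
    """Zero the gradients of pruned filters after loss.backward(), so the
    optimizer step leaves pruned filters untouched (the 'pruning-aware'
    part of fine-tuning in this sketch)."""
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and name in masks:
            m = masks[name].view(-1, 1, 1, 1)  # broadcast over (in, kH, kW)
            if module.weight.grad is not None:
                module.weight.grad.mul_(m)
```

In a training loop, `apply_gradient_masks(model, masks)` would sit between `loss.backward()` and `optimizer.step()`. The paper's p = 0.5 suggests the mask may be applied softly or stochastically rather than as the hard zeroing shown here.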
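
The Dataset Splits row quotes the standard CIFAR and ILSVRC-2012 splits. The sketch below shows how the CIFAR-10 split is typically loaded with torchvision; the pad-and-crop plus horizontal-flip augmentation is an assumption reflecting common CIFAR practice, since the paper defers its exact configuration to GHFP and CPMC.

```python
import torchvision
import torchvision.transforms as T

# Common CIFAR training augmentation (assumed; the paper follows GHFP/CPMC).
train_tf = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])
test_tf = T.ToTensor()

# Standard split: 50,000 training / 10,000 test images, as stated above.
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=test_tf)
print(len(train_set), len(test_set))  # 50000 10000
```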
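
The Experiment Setup row mentions an initial decay rate α0, a final pruning rate Pl, and β(0) = 1 from Algorithm 1, and the paper follows ASFP-style training, where the effective pruning rate rises asymptotically toward its final value over the course of training. Since Algorithm 1 is not reproduced here, the exact functional form below is an assumption, shown only to make the role of these parameters concrete.

```python
import math

def asymptotic_pruning_rate(epoch: int, final_rate: float,
                            decay: float = 0.03) -> float:
    """Assumed ASFP-style schedule: the pruning rate starts near 0 and
    approaches final_rate (the paper's Pl) exponentially; `decay` plays
    the role of an initial decay-rate parameter like the paper's α0."""
    return final_rate * (1.0 - math.exp(-decay * epoch))

# Example: 100 epochs on ILSVRC-2012 with a hypothetical Pl = 0.3.
for epoch in (0, 25, 50, 100):
    print(epoch, round(asymptotic_pruning_rate(epoch, 0.3), 3))
# 0 -> 0.0, 25 -> 0.158, 50 -> 0.233, 100 -> 0.285
```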