Prior Gradient Mask Guided Pruning-Aware Fine-Tuning

Authors: Linhang Cai, Zhulin An, Chuanguang Yang, Yangchun Yan, Yongjun Xu

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on three image classification benchmarks CIFAR-10/100 and ILSVRC-2012 demonstrate the effectiveness of our method for various CNN architectures, datasets and pruning rates. Notably, on ILSVRC-2012, PGMPF reduces 53.5% FLOPs on ResNet-50 with only 0.90% top-1 accuracy drop and 0.52% top-5 accuracy drop, which has advanced the state-of-the-art with negligible extra computational cost.
Researcher Affiliation | Collaboration | Linhang Cai (1,2), Zhulin An (1)*, Chuanguang Yang (1,2), Yangchun Yan (3), Yongjun Xu (1). (1) Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; (2) University of Chinese Academy of Sciences, Beijing, China; (3) Horizon Robotics, Beijing, China. Email: {cailinhang19g, anzhulin, yangchuanguang, xyj}@ict.ac.cn, yangchun.yan@horizon.ai
Pseudocode | Yes | Algorithm 1: PGMPF Algorithm (a hedged sketch of the pruning-aware fine-tuning loop is given after this table).
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We empirically evaluate our PGMPF for VGGNet and ResNet (Simonyan and Zisserman 2015; He et al. 2016) on three datasets: CIFAR-10/100 and ILSVRC-2012 (Krizhevsky 2009; Russakovsky et al. 2015).
Dataset Splits | Yes | Both CIFAR-10 and CIFAR-100 consist of 50,000 training images and 10,000 test images of size 32×32 pixels, drawn from 10 classes and 100 classes respectively. ILSVRC-2012 contains 1.28 million training images and 50k validation images divided into 1,000 classes.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types).
Software Dependencies | No | The paper mentions the use of deep learning frameworks and models like VGGNet and ResNet, but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | On CIFAR-10/100, we follow the parameter scheme and the training configuration in GHFP and CPMC. On ILSVRC-2012, we follow the parameter setting and the data augmentation scheme in ASFP. The total numbers of pruning and fine-tuning epochs on CIFAR-10/100 and ILSVRC-2012 are 200 and 100 respectively, following the settings of ASFP, ASRFP and GHFP (He et al. 2019a; Cai et al. 2021b,a). Models are either pruned from scratch or pruned from pre-trained models. For pruning pre-trained models, we set the initial learning rate to one-tenth of the original learning rate. Algorithm 1 also details parameters such as the initial decay rate α0, the final pruning rate Pl, β(0) = 1, and p = 0.5 (a configuration sketch follows the code example after this table).
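The Pseudocode row above refers to Algorithm 1 (the PGMPF algorithm), which this summary does not reproduce. The PyTorch-style sketch below is a minimal, hypothetical reading of a prior-gradient-mask guided fine-tuning loop: it assumes the prior mask flags the lowest-L2-norm filters of each convolutional layer at the final pruning rate, attenuates the gradients of those filters by the factor p (0.5 in the paper's setting) during fine-tuning, and softly zeroes the masked filters after each update. The helper names (compute_prior_mask, fine_tune_epoch) and the exact masking rule are assumptions, not the authors' code; consult Algorithm 1 for the precise update.

```python
# Minimal, hypothetical sketch of prior-gradient-mask guided fine-tuning.
# Function names and the masking rule are illustrative, not the authors' code.
import torch
import torch.nn as nn

def compute_prior_mask(conv: nn.Conv2d, prune_rate: float) -> torch.Tensor:
    """Flag the filters with the smallest L2 norms as pruned (mask entry 0)."""
    num_filters = conv.weight.shape[0]
    num_pruned = int(num_filters * prune_rate)
    norms = conv.weight.detach().flatten(1).norm(p=2, dim=1)
    mask = torch.ones(num_filters, device=conv.weight.device)
    if num_pruned > 0:
        _, idx = torch.topk(norms, num_pruned, largest=False)
        mask[idx] = 0.0
    return mask.view(-1, 1, 1, 1)  # broadcastable over (out, in, kH, kW)

def fine_tune_epoch(model, loader, optimizer, criterion, masks, p=0.5):
    """One pruning-aware fine-tuning epoch with gradient masking."""
    model.train()
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        for conv, mask in masks.items():
            # keep gradients of retained filters; attenuate pruned filters' gradients by p
            conv.weight.grad.mul_(mask + (1.0 - mask) * p)
        optimizer.step()
        for conv, mask in masks.items():
            # soft pruning: zero the pruned filters' weights after the update
            conv.weight.data.mul_(mask)

# Example: one prior mask per convolutional layer at an assumed final pruning rate.
# masks = {m: compute_prior_mask(m, prune_rate=0.5)
#          for m in model.modules() if isinstance(m, nn.Conv2d)}
```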
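For the Experiment Setup row, the summary reports epoch counts, the learning-rate reduction for pre-trained models, and the Algorithm 1 parameters, but not a full training recipe. The dictionary below simply collects those reported values in one place; entries set to None (the exact initial decay rate α0 and the final pruning rate Pl) are not stated in this summary and would have to be taken from the paper or from the ASFP/GHFP/CPMC configurations it follows.

```python
# Reported setup collected as a configuration sketch; None marks values
# that are not given in this summary and are left unresolved here.
EXPERIMENT_SETUP = {
    "cifar10_100": {
        "total_epochs": 200,             # pruning + fine-tuning epochs
        "config_source": "GHFP / CPMC parameter scheme",
    },
    "ilsvrc2012": {
        "total_epochs": 100,
        "config_source": "ASFP parameter setting and data augmentation",
    },
    # initial LR = 1/10 of the original LR when pruning pre-trained models
    "pretrained_lr_factor": 0.1,
    "algorithm1_params": {
        "alpha_0": None,                 # initial decay rate (not reported here)
        "P_l": None,                     # final pruning rate (chosen per FLOPs target)
        "beta_0": 1.0,                   # beta(0) = 1
        "p": 0.5,                        # gradient attenuation factor
    },
}
```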