Channel Pruning Guided by Classification Loss and Feature Importance

Authors: Jinyang Guo, Wanli Ouyang, Dong Xu (pp. 10885-10892)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comprehensive experiments on three benchmark datasets, i.e., CIFAR-10, ImageNet, and UCF-101, demonstrate the effectiveness of our CPLI method.
Researcher Affiliation | Collaboration | Jinyang Guo (1), Wanli Ouyang (2), Dong Xu (1); (1) School of Electrical and Information Engineering, The University of Sydney; (2) The University of Sydney, SenseTime Computer Vision Research Group, Australia; {jinyang.guo, wanli.ouyang, dong.xu}@sydney.edu.au
Pseudocode | Yes | Algorithm 1 presents the pseudo code of our CPLI approach for pruning a pre-trained model. (An illustrative pruning sketch follows the table.)
Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of open-source code for the methodology described.
Open Datasets | Yes | We take three popular models VGGNet (Simonyan and Zisserman 2014), ResNet-56 (He et al. 2016), and MobileNetV2 (Sandler et al. 2018) on the CIFAR-10 dataset to demonstrate the effectiveness of the proposed approach. The CIFAR-10 dataset (Krizhevsky 2009) consists of 50k training samples and 10k testing images from 10 classes.
Dataset Splits | No | The paper states the total number of training and testing samples for datasets such as CIFAR-10 (50k training, 10k testing) and ImageNet (1.28 million training, 50k testing), but does not explicitly specify a validation split or percentages for train/validation/test splits. (A loader snippet illustrating the quoted CIFAR-10 split follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory, or cloud computing instances).
Software Dependencies | No | The paper mentions optimization methods such as SGD and loss functions, but does not specify any software dependencies (e.g., deep learning frameworks such as TensorFlow or PyTorch) or the version numbers required for reproduction.
Experiment Setup | Yes | At the fine-tuning stage, similar to (Zhuang et al. 2018), we use SGD with Nesterov momentum for optimization. The momentum, weight decay, and mini-batch size are set to 0.9, 0.0001, and 256, respectively. The initial learning rate is set to 0.1 and step learning rate decay is used. (A hedged optimizer configuration follows the table.)
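
The paper's Algorithm 1 is not reproduced on this page. As a rough, non-authoritative illustration of importance-guided channel pruning in the spirit of CPLI (scoring channels with both a classification-loss term and a feature-importance term), the sketch below uses a first-order Taylor term plus a feature-magnitude term. The scoring function, its 0.1 weighting, and all function names are assumptions for illustration, not the authors' method.

```python
import torch

def channel_importance(conv_out: torch.Tensor, loss: torch.Tensor) -> torch.Tensor:
    """Score each output channel of a convolutional feature map.

    Combines a first-order (Taylor-style) classification-loss term with a simple
    feature-magnitude term. The 0.1 weighting is an illustrative choice, not a
    value taken from the paper.
    """
    # Gradient of the classification loss w.r.t. the feature map, shape (N, C, H, W).
    grad = torch.autograd.grad(loss, conv_out, retain_graph=True)[0]
    loss_term = (grad * conv_out).abs().mean(dim=(0, 2, 3))   # per-channel loss sensitivity
    feature_term = conv_out.abs().mean(dim=(0, 2, 3))         # per-channel feature magnitude
    return loss_term + 0.1 * feature_term

def channels_to_keep(scores: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Return the indices of the highest-scoring channels to retain."""
    num_keep = max(1, int(round(keep_ratio * scores.numel())))
    return torch.topk(scores, num_keep).indices
```

In a full pruning pass one would capture `conv_out` with a forward hook, compute the classification loss on a small batch, keep the top-scoring channels of each layer, and then fine-tune the slimmed network.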
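
To make the quoted CIFAR-10 statistics concrete (50k training and 10k test images, with no validation split specified in the paper), the snippet below loads the dataset via torchvision. The paper does not describe a data-loading pipeline, so this is purely illustrative.

```python
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Expected sizes, matching the figures quoted from the paper: 50000 and 10000.
print(len(train_set), len(test_set))
```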
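
The quoted fine-tuning hyperparameters (SGD with Nesterov momentum, momentum 0.9, weight decay 0.0001, mini-batch size 256, initial learning rate 0.1, step learning rate decay) map onto the configuration below. The paper names neither a framework nor the decay milestones, so PyTorch and the 30-epoch/10x step schedule are assumptions.

```python
import torch
import torch.nn as nn

# Placeholder network; substitute the pruned model to be fine-tuned.
model = nn.Linear(3 * 32 * 32, 10)

# Optimizer settings quoted from the paper: SGD with Nesterov momentum,
# momentum 0.9, weight decay 1e-4, initial learning rate 0.1. The mini-batch
# size of 256 would be set in the DataLoader, not here.
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,
    momentum=0.9,
    weight_decay=1e-4,
    nesterov=True,
)

# "Step learning rate decay" is quoted without a schedule; decaying by 10x
# every 30 epochs is a common but assumed choice.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
```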