Channel Pruning via Automatic Structure Search
Authors: Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, Yonghong Tian
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments: We conduct compression for representative networks, including VGGNet, GoogLeNet and ResNet-56/110 on CIFAR-10 [Krizhevsky et al., 2009], and ResNet-18/34/50/101/152 on ILSVRC-2012 [Russakovsky et al., 2015]. |
| Researcher Affiliation | Collaboration | Mingbao Lin¹, Rongrong Ji¹, Yuxin Zhang¹, Baochang Zhang², Yongjian Wu³, Yonghong Tian⁴. ¹Media Analytics and Computing Laboratory, Department of Artificial Intelligence, School of Informatics, Xiamen University, China; ²School of Automation Science and Electrical Engineering, Beihang University, China; ³Tencent Youtu Lab, Tencent Technology (Shanghai) Co., Ltd, China; ⁴School of Electronics Engineering and Computer Science, Peking University, Beijing, China |
| Pseudocode | Yes | Algorithm 1: ABCPruner |
| Open Source Code | Yes | The source codes can be available at https://github.com/lmbxmu/ABCPruner. |
| Open Datasets | Yes | We conduct compression for representative networks, including VGGNet, GoogLeNet and ResNet-56/110 on CIFAR-10 [Krizhevsky et al., 2009], and ResNet-18/34/50/101/152 on ILSVRC-2012 [Russakovsky et al., 2015]. |
| Dataset Splits | No | The paper refers to 'T_train' and 'T_test' for training and evaluation during the search and fine-tuning. It does not explicitly mention a separate validation split or specify its proportions. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or other detailed computer specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions the Stochastic Gradient Descent (SGD) algorithm but does not provide version numbers for any libraries, frameworks, or other software dependencies. |
| Experiment Setup | Yes | We use the Stochastic Gradient Descent algorithm (SGD) for fine-tuning with momentum 0.9 and the batch size is set to 256. On CIFAR-10, the weight decay is set to 5e-3 and we fine-tune the network for 150 epochs with a learning rate of 0.01, which is then divided by 10 every 50 training epochs. On ILSVRC-2012, the weight decay is set to 1e-4 and 90 epochs are given for fine-tuning. The learning rate is set as 0.1, and divided by 10 every 30 epochs. ... For each structure, we train the pruned model N for two epochs to obtain its fitness. We empirically set T=2, n=3, and M=2 in the Alg. 1. (See the hedged sketches following this table.) |
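
The Pseudocode and Experiment Setup rows point to Algorithm 1 (ABCPruner): an artificial-bee-colony (ABC) search over pruned structures, where each candidate structure is a vector of preserved channel counts per layer and its fitness is the accuracy of the pruned model after two training epochs. The snippet below is a minimal sketch of such a search loop under a generic ABC template, not the paper's exact update rules; the roles assigned to `n` (colony size) and `M` (local-search trials per candidate per cycle) are assumptions, and `abc_structure_search`, `fitness_fn`, and `layer_widths` are illustrative names.

```python
import random

def abc_structure_search(layer_widths, fitness_fn, cycles=10, n=3, M=2):
    """ABC-style search for a pruned structure (channels kept per layer).

    fitness_fn(structure) is assumed to prune the pretrained network to the
    candidate widths, train it briefly (two epochs in the paper), and return
    validation accuracy.
    """
    def random_structure():
        # One preserved-channel count per layer, between 1 and the full width.
        return [random.randint(1, c) for c in layer_widths]

    def neighbour(structure):
        # Perturb the channel count of one randomly chosen layer.
        s = list(structure)
        i = random.randrange(len(s))
        s[i] = random.randint(1, layer_widths[i])
        return s

    colony = [random_structure() for _ in range(n)]
    scores = [fitness_fn(s) for s in colony]
    for _ in range(cycles):
        for i in range(n):
            for _ in range(M):                  # local search around each candidate
                candidate = neighbour(colony[i])
                score = fitness_fn(candidate)
                if score > scores[i]:           # greedy selection, as in standard ABC
                    colony[i], scores[i] = candidate, score
    best = max(range(n), key=lambda i: scores[i])
    return colony[best], scores[best]
```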
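
The fine-tuning hyperparameters quoted in the Experiment Setup row correspond to plain SGD with a step learning-rate schedule. Below is a minimal sketch of that configuration, assuming a PyTorch-style training setup; the framework choice and the `build_finetune_optimizer` helper are illustrative, not stated in the quoted text.

```python
import torch

def build_finetune_optimizer(model, dataset="cifar10"):
    """Return SGD optimizer, step LR scheduler, and epoch budget matching the
    reported settings: momentum 0.9, batch size 256, and either
    CIFAR-10 (lr 0.01, weight decay 5e-3, /10 every 50 of 150 epochs) or
    ILSVRC-2012 (lr 0.1, weight decay 1e-4, /10 every 30 of 90 epochs)."""
    if dataset == "cifar10":
        lr, weight_decay, step_size, epochs = 0.01, 5e-3, 50, 150
    else:  # ILSVRC-2012
        lr, weight_decay, step_size, epochs = 0.1, 1e-4, 30, 90
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=weight_decay)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                step_size=step_size, gamma=0.1)
    return optimizer, scheduler, epochs
```

In use, `scheduler.step()` would be called once per training epoch to advance the decay schedule, and the batch size of 256 would be set on the data loader.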