Dynamic Channel Pruning: Feature Boosting and Suppression
Authors: Xitong Gao, Yiren Zhao, Łukasz Dudziak, Robert Mullins, Cheng-zhong Xu
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran extensive experiments on CIFAR-10 (Krizhevsky et al., 2014) and the ImageNet ILSVRC2012 (Deng et al., 2009), two popular image classification datasets. ... Empirical results show that under the same speed-ups, FBS can produce models with validation accuracies surpassing all other channel pruning and dynamic conditional execution methods examined in the paper. |
| Researcher Affiliation | Academia | 1 Shenzhen Institutes of Advanced Technology, Shenzhen, China; 2,3,4 University of Cambridge, Cambridge, UK; 5 University of Macau, Macau, China; 1 xt.gao@siat.ac.cn, 2 yaz21@cam.ac.uk |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Finally, the implementation of FBS and the optimized networks are fully open source and released to the public (https://github.com/deep-fry/mayo). |
| Open Datasets | Yes | We ran extensive experiments on CIFAR-10 (Krizhevsky et al., 2014) and the ImageNet ILSVRC2012 (Deng et al., 2009), two popular image classification datasets. |
| Dataset Splits | Yes | Empirical results show that under the same speed-ups, FBS can produce models with validation accuracies surpassing all other channel pruning and dynamic conditional execution methods examined in the paper. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions using 'conventional stochastic gradient descent' for training but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | We trained M-CifarNet (see Appendix A) with a 0.01 learning rate and a 256 batch size. We reduced the learning rate by a factor of 10 for every 100 epochs. ... ILSVRC2012 classifiers, i.e. ResNet-18 and VGG-16, were trained with a procedure similar to Appendix A. The difference was that they were trained for a maximum of 35 epochs, the learning rate was decayed for every 20 epochs, and NS models were all pruned at 15 epochs. |
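
The hyperparameters in the Experiment Setup row above map onto a conventional SGD training loop. The sketch below is a hedged reconstruction, assuming a PyTorch-style setup: the model definitions, data loaders, total CIFAR-10 epoch count, and momentum/weight-decay settings are not given in the quoted text and appear only as placeholders, while the 0.01 learning rate, 256 batch size, and the two decay schedules follow the row above. The authors' actual implementation is the Mayo repository linked in the Open Source Code row.

```python
# Minimal sketch of the training schedule quoted in the Experiment Setup row.
# Assumptions (not stated in the quoted text): a PyTorch-style training loop,
# plain SGD with no momentum/weight decay, and placeholder model/data loaders.
# The authors' actual implementation is the Mayo repository linked above.
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR


def train(model: nn.Module, train_loader, epochs: int, lr_decay_every: int):
    """Train with SGD at lr 0.01, dividing the learning rate by 10
    every `lr_decay_every` epochs (batch size 256 is set in the loader)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)
    scheduler = StepLR(optimizer, step_size=lr_decay_every, gamma=0.1)
    for _ in range(epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()  # decay once per epoch boundary


# M-CifarNet on CIFAR-10: lr decayed every 100 epochs (total epoch count is
# not given in the quoted text; 300 here is a placeholder).
# train(m_cifarnet, cifar10_loader, epochs=300, lr_decay_every=100)

# ResNet-18 / VGG-16 on ILSVRC2012: at most 35 epochs, lr decayed every
# 20 epochs (the NS baselines were additionally pruned at epoch 15).
# train(resnet18, ilsvrc_loader, epochs=35, lr_decay_every=20)
```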