Collaborative Channel Pruning for Deep Networks
Authors: Hanyu Peng, Jiaxiang Wu, Shifeng Chen, Junzhou Huang
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation on two benchmark data sets indicates that our proposed CCP algorithm achieves higher classification accuracy with similar computational complexity than other state-of-the-art channel pruning algorithms. In this section, we first evaluate our proposed collaborative channel pruning approach on two benchmark data sets, CIFAR-10 (Krizhevsky, 2009) and ILSVRC-12 (Russakovsky et al., 2015), and demonstrate its advantage over other channel pruning algorithms. |
| Researcher Affiliation | Collaboration | Hanyu Peng¹, Jiaxiang Wu², Shifeng Chen¹, Junzhou Huang³. ¹Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; ²Tencent AI Lab; ³Department of CSE, The University of Texas at Arlington. This work was done while Hanyu Peng was an intern at Tencent AI Lab. |
| Pseudocode | Yes | Algorithm 1 (Collaborative Channel Pruning). Input: training set {(x_n, y_n)}_{n=1}^N and pre-trained network θ₀ = {W₀^(l)}_{l=1}^L; Output: channel-pruned network θ = {(β^(l), W^(l))}_{l=1}^L. Steps: (1) initialize {u_i} and {s_ij} for all layers; (2) for n = 1, ..., N: compute outputs and gradients for (x_n, y_n) and update {u_i} and {s_ij}; (3) for l = 1, ..., L: compute the pairwise correlation matrix Ŝ and solve (22) to obtain the binary mask β^(l); (4) fine-tune the model with the binary masks {β^(l)}. (A hedged PyTorch sketch of this procedure appears after the table.) |
| Open Source Code | No | The paper does not provide concrete access to its own source code, such as a direct repository link or an explicit statement of code release. |
| Open Datasets | Yes | In this section, we first evaluate our proposed collaborative channel pruning approach on two benchmark data sets, CIFAR-10 (Krizhevsky, 2009) and ILSVRC-12 (Russakovsky et al., 2015) |
| Dataset Splits | Yes | The CIFAR-10 data set consists of 50k training samples and 10k test samples, drawn from 10 categories. The ILSVRC-12 data set consists of over 1.2M training samples and 50k validation samples from 1000 categories. (A standard torchvision loading sketch for these splits appears after the table.) |
| Hardware Specification | Yes | These results are obtained on an Nvidia P40 GPU. |
| Software Dependencies | No | The paper mentions implementing the method using 'PyTorch' but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | The channel-pruned network is fine-tuned for 200 epochs using SGD with a batch size of 128; we set weight decay to 0.0005 and momentum to 0.8. The learning rate starts from 0.1 and is divided by 10 at the 60th, 120th, and 160th epochs. The compressed model is then fine-tuned for 100 epochs with an initial learning rate of 10^-3, divided by 10 every 30 epochs. We use a batch size of 128 and set weight decay to 0.0001 and momentum to 0.9. (A minimal PyTorch sketch of these schedules appears after the table.) |
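
The pseudocode row maps naturally onto PyTorch, which the paper reports using. The sketch below is illustrative only: the helper names (`collect_channel_stats`, `solve_mask`), the activation-energy importance `u`, the channel-correlation matrix `S`, and the greedy mask solver are all stand-ins for the paper's gradient-based statistics and the binary quadratic program of Eq. (22), neither of which is reproduced here.

```python
import torch

def collect_channel_stats(model, loader, criterion, layer):
    """Lines 1-5 of Algorithm 1: accumulate per-channel importance {u_i}
    and pairwise terms {s_ij} for one conv layer over the training set.
    The paper derives these from outputs and gradients; mean activation
    energy and channel correlations serve as simple stand-ins here."""
    acts = {}
    hook = layer.register_forward_hook(
        lambda m, i, o: acts.__setitem__("a", o.detach()))
    C = layer.out_channels
    u, S, n = torch.zeros(C), torch.zeros(C, C), 0
    for x, y in loader:
        model.zero_grad()
        criterion(model(x), y).backward()      # outputs and gradients (line 3)
        a = acts["a"].flatten(2).mean(dim=2)   # (batch, C) channel responses
        u += a.pow(2).sum(dim=0)               # importance accumulator (line 4)
        S += a.t() @ a                         # pairwise-term accumulator (line 4)
        n += a.size(0)
    hook.remove()
    return u / n, S / n

def solve_mask(u, S, k):
    """Lines 6-9: a greedy stand-in for the binary quadratic program (22);
    keep k channels that maximize importance minus pairwise redundancy."""
    keep = []
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for j in range(u.numel()):
            if j in keep:
                continue
            gain = u[j].item() - S[j, keep].sum().item()
            if gain > best_gain:
                best, best_gain = j, gain
        keep.append(best)
    beta = torch.zeros_like(u)
    beta[keep] = 1.0
    return beta  # binary mask beta^(l); line 10 fine-tunes with these masks
```

A typical per-layer call would be `u, S = collect_channel_stats(net, train_loader, criterion, conv)` followed by `beta = solve_mask(u, S, k)`, repeated for every convolutional layer before the final fine-tuning pass.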
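
For the CIFAR-10 splits quoted in the Dataset Splits row, a standard torchvision pipeline reproduces the 50k/10k partition. The paper does not describe its data pipeline, so the transforms and normalization constants below are the usual CIFAR-10 defaults, not the authors' settings.

```python
import torch
import torchvision
import torchvision.transforms as T

# Standard torchvision loaders matching the quoted splits (50k train / 10k test).
transform = T.Compose([T.ToTensor(),
                       T.Normalize((0.4914, 0.4822, 0.4465),
                                   (0.2470, 0.2435, 0.2616))])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=128, shuffle=True)   # batch size 128, as in the setup row
test_loader = torch.utils.data.DataLoader(
    test_set, batch_size=128, shuffle=False)

print(len(train_set), len(test_set))  # 50000 10000
```

ILSVRC-12 would be loaded analogously via `torchvision.datasets.ImageNet`, which requires a local copy of the data.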
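
The two fine-tuning schedules in the Experiment Setup row map directly onto PyTorch's built-in SGD optimizer and learning-rate schedulers. A minimal sketch follows, assuming (as the epoch counts suggest) that the first schedule is the CIFAR-10 one and the second the ILSVRC-12 one; `pruned_model` is a placeholder module so the snippet runs as written.

```python
import torch
import torch.nn as nn

pruned_model = nn.Linear(8, 8)  # placeholder for the channel-pruned network

# CIFAR-10 fine-tuning: 200 epochs, lr 0.1 divided by 10 at epochs 60/120/160
optimizer = torch.optim.SGD(pruned_model.parameters(), lr=0.1,
                            momentum=0.8, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.1)

# ILSVRC-12 fine-tuning: 100 epochs, lr 1e-3 divided by 10 every 30 epochs
# optimizer = torch.optim.SGD(pruned_model.parameters(), lr=1e-3,
#                             momentum=0.9, weight_decay=1e-4)
# scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(200):
    # ... one full pass over the training set in batches of 128 ...
    optimizer.step()   # stands in for the per-batch update loop
    scheduler.step()   # apply the learning-rate decay once per epoch
```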