Collaborative Channel Pruning for Deep Networks

Authors: Hanyu Peng, Jiaxiang Wu, Shifeng Chen, Junzhou Huang

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical evaluation on two benchmark data sets indicates that our proposed CCP algorithm achieves higher classification accuracy with similar computational complexity than other state-of-the-art channel pruning algorithms." "In this section, we first evaluate our proposed collaborative channel pruning approach on two benchmark data sets, CIFAR-10 (Krizhevsky, 2009) and ILSVRC-12 (Russakovsky et al., 2015), and demonstrate its advantage over other channel pruning algorithms."
Researcher Affiliation | Collaboration | Hanyu Peng (1), Jiaxiang Wu (2), Shifeng Chen (1), Junzhou Huang (3). (1) Multimedia Laboratory, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; (2) Tencent AI Lab; (3) Department of CSE, The University of Texas at Arlington. This work was done while Hanyu Peng was an intern at Tencent AI Lab.
Pseudocode | Yes | Algorithm 1 (Collaborative Channel Pruning). Input: training set $\{(x_n, y_n)\}_{n=1}^{N}$; pre-trained network $\theta_0 = \{W_0^{(l)}\}_{l=1}^{L}$. Output: channel-pruned network $\theta = \{(\beta^{(l)}, W^{(l)})\}_{l=1}^{L}$. 1: initialize $\{u_i\}$ and $\{s_{ij}\}$ for all layers; 2: for $n = 1, \dots, N$ do; 3: compute outputs and gradients for $(x_n, y_n)$; 4: update $\{u_i\}$ and $\{s_{ij}\}$ for all layers; 5: end for; 6: for $l = 1, \dots, L$ do; 7: compute the pairwise correlation matrix $\hat{S}$; 8: solve (22) to obtain the binary mask $\beta^{(l)}$; 9: end for; 10: fine-tune the model with binary masks $\{\beta^{(l)}\}$. (A code sketch of this procedure appears after the table.)
Open Source Code | No | The paper does not provide concrete access to its own source code, such as a direct repository link or an explicit statement of code release.
Open Datasets | Yes | "In this section, we first evaluate our proposed collaborative channel pruning approach on two benchmark data sets, CIFAR-10 (Krizhevsky, 2009) and ILSVRC-12 (Russakovsky et al., 2015)"
Dataset Splits | Yes | "The CIFAR-10 data set consists of 50k training samples and 10k test samples, drawn from 10 categories. The ILSVRC-12 data set consists of over 1.2M training samples and 50k validation samples from 1000 categories." (A loading check that reproduces the CIFAR-10 split appears after the table.)
Hardware Specification | Yes | "These results are obtained on a Nvidia P40 GPU."
Software Dependencies | No | The paper mentions implementing the method using 'PyTorch' but does not specify a version number for this or any other software dependency.
Experiment Setup | Yes | "The channel-pruned network is fine-tuned for 200 epochs using SGD with batch size of 128; we set weight decay to 0.0005 and momentum to 0.8. The learning rate starts from 0.1 and is divided by 10 at the 60th, 120th, and 160th epochs. The compressed model is then fine-tuned for 100 epochs with an initial learning rate of $10^{-3}$ and divided by 10 every 30 epochs. We use a batch size of 128 and set weight decay to 0.0001 and momentum to 0.9." (A PyTorch sketch of the first schedule appears after the table.)
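
Algorithm 1's structure lends itself to a short implementation sketch. The following is a minimal, non-authoritative Python/PyTorch rendering of the per-layer statistics pass and channel selection. The paper defines the exact importance terms $u_i$, interaction terms $s_{ij}$, and a 0-1 quadratic program (22) for the mask; the Taylor-style saliency proxy and greedy selector below, along with the names `channel_stats` and `select_channels`, are stand-in assumptions, since no official code is released.

```python
# Minimal sketch of Algorithm 1 (Collaborative Channel Pruning).
# The saliency proxy and greedy selector are simplified stand-ins for the
# paper's statistics and its 0-1 quadratic program (22).
import torch

def channel_stats(act, grad):
    """Accumulate per-channel statistics from one batch.
    act, grad: activations and their gradients, shape (N, C, H, W)."""
    g = (act * grad).flatten(2).sum(-1)      # (N, C): per-channel saliency
    u = g.abs().mean(0)                      # u_i: individual importance
    s = (g.t() @ g) / g.shape[0]             # s_ij: pairwise interaction
    return u, s

def select_channels(u, s, keep):
    """Greedy stand-in for the binary-mask optimization: pick channels with
    high importance while penalizing interaction with already-kept ones."""
    beta = torch.zeros(u.numel(), dtype=torch.bool)
    for _ in range(keep):
        score = u - (s[:, beta].sum(1) if beta.any() else 0.0)
        score[beta] = float("-inf")          # never re-pick a kept channel
        beta[score.argmax()] = True
    return beta                              # binary mask beta^(l)

# Toy usage: statistics from random "activations/gradients", keep 8 of 16.
u, s = channel_stats(torch.randn(32, 16, 8, 8), torch.randn(32, 16, 8, 8))
mask = select_channels(u, s, keep=8)
```

The redundancy penalty in `select_channels` mirrors the collaborative idea of the paper, that channels should be judged jointly rather than in isolation, even though the real method solves this jointly as a quadratic program rather than greedily.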
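The quoted CIFAR-10 split (50k train / 10k test) is what the standard torchvision loader produces out of the box; a quick check, assuming torchvision is available (the paper does not name it):

```python
import torchvision
from torchvision import transforms

tfm = transforms.ToTensor()
train = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=tfm)
test = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=tfm)
print(len(train), len(test))  # 50000 10000: the 50k/10k split quoted above
```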
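The first quoted schedule (200 epochs of SGD, learning rate 0.1 cut by 10 at epochs 60, 120, and 160, weight decay 0.0005, momentum 0.8) maps directly onto standard PyTorch components. A sketch, with a dummy module standing in for the actual pruned network:

```python
import torch
import torch.nn as nn

# Dummy stand-in for the channel-pruned network; any nn.Module would do.
pruned_model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                             nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                             nn.Linear(16, 10))

optimizer = torch.optim.SGD(pruned_model.parameters(), lr=0.1,
                            momentum=0.8, weight_decay=5e-4)
# "divided by 10 at the 60th, 120th, and 160th epochs"
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.1)

for epoch in range(200):
    # ... one training pass over CIFAR-10 with batch size 128 goes here ...
    scheduler.step()
```

The second quoted schedule (100 epochs, initial learning rate $10^{-3}$ divided by 10 every 30 epochs) would swap in `StepLR(optimizer, step_size=30, gamma=0.1)` with `lr=1e-3`, `momentum=0.9`, and `weight_decay=1e-4`.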