Cooperative Pruning in Cross-Domain Deep Neural Network Compression
Authors: Shangyu Chen, Wenya Wang, Sinno Jialin Pan
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to verify the effectiveness of our proposed method compared with several state-of-the-art approaches in the setting of limited training data. We conduct comparison experiments using the following baseline pruning methods: 1) LWC [Han et al., 2015], 2) OBD [LeCun et al., 1990], 3) DNS [Guo et al., 2016], 4) LOBS [Dong et al., 2017]. Table 1: Overall results of CIFAR9-STL9 using CIFAR-Net. Table 2: Overall results of ImageNet-PASCAL, ImageNet-Caltech256, ImageNet-Bing using ImageNet pre-trained ResNet18. CR is 4% for each layer. |
| Researcher Affiliation | Academia | Shangyu Chen, Wenya Wang and Sinno Jialin Pan, Nanyang Technological University, Singapore; schen025@e.ntu.edu.sg, wangwy@ntu.edu.sg, sinnopan@ntu.edu.sg |
| Pseudocode | Yes | Alg.1 illustrates the whole process of Co-Prune: |
| Open Source Code | Yes | Codes are available at https://github.com/csyhhu/Co-Prune. |
| Open Datasets | Yes | CIFAR9-STL9 is a modified version of the combined CIFAR10 and STL10 datasets. ... ImageCLEF is a 4-domain image dataset. It extracts 600 images of 12 classes from ImageNet [Deng et al., 2009], Caltech-256 [Griffin et al., 2007], PASCAL [Everingham et al., 2010] and Bing, respectively. |
| Dataset Splits | No | Since data is quite limited in ImageCLEF, we divide each domain into 80% for training and 20% for testing (with class balance). No explicit validation split is mentioned. A stratified-split sketch is given after the table. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned in the paper regarding the experimental setup. |
| Software Dependencies | No | In practice, Adam [Kingma and Ba, 2014] with initial learning rate 10^-3 is used for Co-Prune and all retraining processes. This mentions an algorithm/optimizer but not specific software libraries or their versions. |
| Experiment Setup | Yes | In practice, Adam [Kingma and Ba, 2014] with initial learning rate 10^-3 is used for Co-Prune and all retraining processes. The learning rate is divided by 10 when the training loss increases for 3 consecutive epochs, and training to optimum is reached once the learning rate becomes smaller than 10^-6. In Co-Prune, α₀ = 0.7, α_min = 0.3, β = 3 as a trade-off between computational time and accuracy. A minimal sketch of this learning-rate schedule is given after the table. |
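
The Experiment Setup row describes a loss-driven learning-rate schedule used for Co-Prune and all retraining passes. Below is a minimal sketch of that schedule, assuming PyTorch; the function name `retrain` and its arguments (`model`, `loader`, `criterion`) are hypothetical placeholders rather than the authors' code, and the Co-Prune hyperparameters (α₀, α_min, β) are omitted because their update rule is not given in the excerpt above.

```python
# Minimal sketch (not the authors' implementation) of the training schedule
# quoted above: Adam with initial lr 1e-3, lr divided by 10 whenever the
# training loss increases for 3 consecutive epochs, stop once lr < 1e-6.
import torch

def retrain(model, loader, criterion, lr=1e-3, min_lr=1e-6, patience=3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    prev_loss, bad_epochs = float("inf"), 0
    while lr >= min_lr:                      # "training to optimum" criterion
        epoch_loss = 0.0
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        # Count consecutive epochs in which the training loss increased.
        bad_epochs = bad_epochs + 1 if epoch_loss > prev_loss else 0
        prev_loss = epoch_loss
        if bad_epochs >= patience:           # divide the learning rate by 10
            lr /= 10
            for group in optimizer.param_groups:
                group["lr"] = lr
            bad_epochs = 0
    return model
```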
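
The Dataset Splits row mentions an 80%/20% per-domain split "with class balance", i.e. a stratified split. The snippet below is one way to reproduce such a split, assuming scikit-learn; `split_domain` and its arguments are hypothetical names, not part of the paper or its released code.

```python
# Minimal sketch (assumption, not the authors' code) of an 80%/20% split
# with class balance: a stratified per-domain split via scikit-learn.
from sklearn.model_selection import train_test_split

def split_domain(samples, labels, test_size=0.2, seed=0):
    # stratify=labels keeps class proportions equal in the train and test sets
    return train_test_split(samples, labels,
                            test_size=test_size,
                            stratify=labels,
                            random_state=seed)
```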