Cooperative Pruning in Cross-Domain Deep Neural Network Compression

Authors: Shangyu Chen, Wenya Wang, Sinno Jialin Pan

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted to verify the effectiveness of our proposed method compared with several state-of-the-art approaches in the setting of limited training data. We conduct comparison experiments using the following baseline pruning methods: 1) LWC [Han et al., 2015], 2) OBD [LeCun et al., 1990], 3) DNS [Guo et al., 2016], 4) LOBS [Dong et al., 2017]. Table 1: Overall results of CIFAR9-STL9 using CIFAR-Net. Table 2: Overall results of ImageNet→PASCAL, ImageNet→Caltech256, ImageNet→Bing using ImageNet pre-trained ResNet18. CR is 4% for each layer. (A per-layer CR sketch is given after the table.)
Researcher Affiliation | Academia | Shangyu Chen, Wenya Wang and Sinno Jialin Pan, Nanyang Technological University, Singapore. schen025@e.ntu.edu.sg, wangwy@ntu.edu.sg, sinnopan@ntu.edu.sg
Pseudocode | Yes | Alg. 1 illustrates the whole process of Co-Prune.
Open Source Code | Yes | Codes are available at https://github.com/csyhhu/Co-Prune.
Open Datasets | Yes | CIFAR9-STL9 is a modified version of combined CIFAR10 and STL10 dataset. ... ImageCLEF is a 4-domain image dataset. It extracts 600 images of 12 classes from ImageNet [Deng et al., 2009], Caltech-256 [Griffin et al., 2007], PASCAL [Everingham et al., 2010] and Bing, respectively. (A sketch of the CIFAR9 construction follows the table.)
Dataset Splits | No | Since data is quite limited in ImageCLEF, we divide each domain into 80% for training and 20% for testing (with class balance). No explicit validation split is mentioned. (A class-balanced split is sketched after the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned in the paper regarding the experimental setup.
Software Dependencies | No | In practice, Adam [Kingma and Ba, 2014] with initial learning rate 10^-3 is used for Co-Prune and all retraining processes. This mentions an optimizer but not specific software libraries or their versions.
Experiment Setup | Yes | In practice, Adam [Kingma and Ba, 2014] with initial learning rate 10^-3 is used for Co-Prune and all retraining processes. Learning rate will be divided by 10 when training loss increases for 3 consecutive epochs. Training to optimum is considered as learning rate becomes smaller than 10^-6. In Co-Prune, α0 = 0.7, αmin = 0.3, β = 3 for tradeoff between computational time and accuracy. (The training schedule is sketched below.)
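
The Open Datasets entry describes CIFAR9-STL9 as a combination of CIFAR10 and STL10. Below is a minimal sketch of one plausible way to build the CIFAR side: keep the 9 CIFAR-10 classes that also appear in STL-10. The exact class list, label remapping and preprocessing here are assumptions, not the authors' released preprocessing (see their repository above for the actual code).

# Sketch only: build a 9-class CIFAR subset shared with STL-10
# (CIFAR-10's "frog" is dropped; "automobile" corresponds to STL-10's "car").
import numpy as np
from torchvision import datasets, transforms
from torch.utils.data import Subset

SHARED_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "horse", "ship", "truck"]

def build_cifar9(root="./data", train=True):
    cifar = datasets.CIFAR10(root=root, train=train, download=True,
                             transform=transforms.ToTensor())
    keep_ids = [cifar.class_to_idx[c] for c in SHARED_CLASSES]
    # Keep only samples whose label belongs to one of the 9 shared classes.
    targets = np.array(cifar.targets)
    indices = np.where(np.isin(targets, keep_ids))[0]
    # Remap the kept labels to 0..8 so the classifier head has 9 outputs.
    remap = {old: new for new, old in enumerate(keep_ids)}
    cifar.targets = [remap.get(t, -1) for t in cifar.targets]
    return Subset(cifar, indices.tolist())

cifar9_train = build_cifar9(train=True)
print(len(cifar9_train))  # roughly 45,000 images (9 of the 10 classes)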
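
For the Dataset Splits entry, the 80/20 class-balanced split of each ImageCLEF domain can be reproduced with a stratified split. The use of scikit-learn and a fixed seed are assumptions; the paper does not say how the split was implemented.

# Sketch only: 80/20 split that preserves per-class proportions.
from sklearn.model_selection import train_test_split

def stratified_split(samples, labels, test_size=0.2, seed=0):
    """Split (samples, labels) 80/20 while keeping class balance."""
    return train_test_split(samples, labels,
                            test_size=test_size,
                            stratify=labels,
                            random_state=seed)

# Usage: for each ImageCLEF domain (ImageNet, Caltech-256, PASCAL, Bing),
# `paths` is the list of image file paths and `labels` the 12-way class ids:
# train_x, test_x, train_y, test_y = stratified_split(paths, labels)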
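
For the 4% per-layer compression ratio (CR) used in Table 2, the sketch below illustrates what that ratio means using plain magnitude pruning: keep the 4% largest-magnitude weights in each convolutional and fully connected layer. This only illustrates the CR; Co-Prune's actual cooperative selection criterion (Alg. 1 in the paper) is not reproduced here.

# Sketch only: per-layer masks keeping the top 4% of weights by magnitude.
import torch
import torch.nn as nn
from torchvision.models import resnet18

def magnitude_masks(model, keep_ratio=0.04):
    """Return a {parameter name: 0/1 mask} dict keeping `keep_ratio` of each layer."""
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            w = module.weight.data
            k = max(1, int(keep_ratio * w.numel()))
            threshold = w.abs().flatten().topk(k).values.min()
            masks[name + ".weight"] = (w.abs() >= threshold).float()
    return masks

# weights="IMAGENET1K_V1" would load the ImageNet pre-trained model used in the paper.
model = resnet18(weights=None)
masks = magnitude_masks(model, keep_ratio=0.04)
# Applying a mask zeroes the pruned weights in place.
for name, module in model.named_modules():
    if name + ".weight" in masks:
        module.weight.data.mul_(masks[name + ".weight"])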
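
For the Experiment Setup entry, the retraining schedule reads as: Adam with initial learning rate 1e-3, the learning rate divided by 10 whenever the training loss rises for 3 consecutive epochs, and training considered converged once the learning rate falls below 1e-6. The sketch below implements only that schedule; the model, data loader and loss function are placeholders.

# Sketch only: the learning-rate schedule and stopping criterion stated in the paper.
import torch

def train_to_optimum(model, loader, loss_fn, device="cpu"):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    lr = 1e-3
    prev_loss = float("inf")
    rises = 0  # consecutive epochs in which the training loss increased
    while lr >= 1e-6:  # "training to optimum": stop once lr drops below 1e-6
        model.train()
        epoch_loss = 0.0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        rises = rises + 1 if epoch_loss > prev_loss else 0
        prev_loss = epoch_loss
        if rises >= 3:  # divide the learning rate by 10 after 3 consecutive increases
            lr /= 10
            for group in optimizer.param_groups:
                group["lr"] = lr
            rises = 0
    return model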