Beyond Network Pruning: a Joint Search-and-Training Approach

Authors: Xiaotong Lu, Han Huang, Weisheng Dong, Xin Li, Guangming Shi

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on ResNet and VGGNet demonstrate the superior performance of our proposed method on popular datasets including CIFAR10, CIFAR100 and ImageNet." (Section 4, Experimental Results)
Researcher Affiliation | Academia | "Xiaotong Lu1, Han Huang1, Weisheng Dong1, Xin Li2, Guangming Shi1; 1Xidian University, 2West Virginia University; {xiaotonglu47, hanhuang8264}@gmail.com, wsdong@mail.xidian.edu.cn, xin.li@ieee.org, gmshi@xidian.edu.cn"
Pseudocode | Yes | "Algorithm 1: Sampler" and "Algorithm 2: Search-and-Training Algorithm"
Open Source Code | No | The paper does not provide a direct link to its source code or explicitly state that its code is open-source or publicly available.
Open Datasets | Yes | "Extensive experiments on ResNet and VGGNet demonstrate the superior performance of our proposed method on popular datasets including CIFAR10, CIFAR100 and ImageNet."
Dataset Splits | Yes | "For searching and training, we randomly extract 80% of the official training images as the training set Dtrain in Algorithm 2, and the rest as the validation set Dval." (a minimal split sketch follows the table)
Hardware Specification | Yes | "for the experiments on the CIFAR-10 dataset, we use one NVIDIA Titan XP GPU for training and searching, and NVIDIA 1080Ti for CIFAR-100." and "All experiments on the ImageNet dataset use 4 NVIDIA Titan XP GPUs with batch size of 256." and "we use 4 NVIDIA 2080Ti GPUs to validate our method on the ResNet-50 model."
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number for it or any other software dependencies.
Experiment Setup | Yes | "For searching and training, we randomly extract 80% of the official training images as the training set Dtrain in Algorithm 2, and the rest as the validation set Dval. In our implementation, we search the compact networks with thresholds in the range [0.6, 0.65, 0.7, 0.75, 0.8] for each layer. The hyper-parameter λ is set to 0.1, γ is set to 2.0, and all parameters and weights are initialized by Kaiming normal in PyTorch. For different datasets, we apply different settings: 1) On the CIFAR datasets, we use SGD with momentum of 0.9 and weight decay of 0.00005 as the optimizer. At the beginning, we train the target network coarsely for 100 epochs with batch size 128. The learning rate starts from 0.1 and is reduced by a cosine scheduler. Then we search T = 30 compact networks, whose parameters are optimized on Dtrain for M (M = 40 for t ≤ 20, 30 for t > 20) epochs with learning rates of 0.05/0.01, corresponding to the different M. The weights are optimized on Dval for N = 5 epochs with a fixed learning rate of 0.001. In the fine-tuning stage, we set a batch size of 256 and a learning rate of 0.01, and optimize the selected compact network until convergence. 2) On the ImageNet dataset, we optimize the parameters via Adam with weight decay of 0.00001 and the weights via the same SGD as on CIFAR. For the ResNet model, we coarsely train 40 epochs with an initial learning rate of 0.1 and search T = 20 compact networks. M is set to 10 with a learning rate of 0.001 and N is set to 2. When fine-tuning, we set the initial learning rate to 0.001 and divide it by 10 every 20 epochs." (a hedged training-setup sketch follows the table)
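
The 80/20 split quoted under "Dataset Splits" can be reproduced with standard PyTorch utilities. The snippet below is a minimal sketch, not the authors' code: it assumes torchvision's CIFAR-10 loader, an arbitrary augmentation pipeline, and a fixed seed, none of which are specified in the paper; D_train and D_val mirror the paper's Dtrain/Dval.

    import torch
    from torch.utils.data import DataLoader, random_split
    from torchvision import datasets, transforms

    # Minimal sketch of the 80/20 split described in the paper
    # (assumption: torchvision CIFAR-10; the authors' splitting code is not public).
    transform = transforms.Compose([
        transforms.RandomCrop(32, padding=4),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ])
    full_train = datasets.CIFAR10(root="./data", train=True,
                                  download=True, transform=transform)

    # 80% of the official training images -> Dtrain, the remaining 20% -> Dval.
    n_train = int(0.8 * len(full_train))
    n_val = len(full_train) - n_train
    D_train, D_val = random_split(
        full_train, [n_train, n_val],
        generator=torch.Generator().manual_seed(0))  # seed is an assumption

    train_loader = DataLoader(D_train, batch_size=128, shuffle=True)
    val_loader = DataLoader(D_val, batch_size=128, shuffle=False)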
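The CIFAR portion of the quoted experiment setup maps directly onto standard PyTorch primitives. The sketch below is an illustrative reconstruction under stated assumptions, not the authors' implementation: model, train_loader, and the plain cross-entropy loss are placeholders, and it covers only the 100-epoch coarse-training stage (SGD, momentum 0.9, weight decay 5e-5, learning rate 0.1 with a cosine schedule, Kaiming-normal initialization), not the search of Algorithms 1 and 2.

    import torch
    import torch.nn as nn

    def init_kaiming(model: nn.Module) -> None:
        # "all parameters and weights are initialized by Kaiming normal in PyTorch"
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight)
                if m.bias is not None:
                    nn.init.zeros_(m.bias)

    def coarse_train(model, train_loader, epochs=100, device="cuda"):
        # Coarse-training stage with the CIFAR hyper-parameters quoted above.
        model.to(device)
        init_kaiming(model)
        criterion = nn.CrossEntropyLoss()  # loss choice is an assumption
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                    momentum=0.9, weight_decay=5e-5)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
        for _ in range(epochs):
            model.train()
            for images, labels in train_loader:
                images, labels = images.to(device), labels.to(device)
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()
            scheduler.step()  # cosine decay from the initial rate of 0.1
        return model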