Runtime Neural Pruning
Authors: Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments on three different datasets including CIFAR-10, CIFAR-100 [22] and ILSVRC2012 [36] to show the effectiveness of our method. Experimental results on the CIFAR [22] and ImageNet [36] datasets show that our framework successfully learns to allocate different amounts of computational resources for different input images, and achieves much better performance at the same cost. |
| Researcher Affiliation | Academia | Ji Lin, Department of Automation, Tsinghua University, lin-j14@mails.tsinghua.edu.cn; Yongming Rao, Department of Automation, Tsinghua University, raoyongming95@gmail.com; Jiwen Lu, Department of Automation, Tsinghua University, lujiwen@tsinghua.edu.cn; Jie Zhou, Department of Automation, Tsinghua University, jzhou@tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1 Runtime neural pruning for solving optimization problem (1). Input: training set with labels {X}. Output: backbone CNN C, decision network D. 1: initialize: train C in the normal way or initialize C with a pre-trained model 2: for i = 1, 2, ..., M do 3: // train decision network 4: for j = 1, 2, ..., N1 do 5: Sample random minibatch from {X} 6: Forward and sample ϵ-greedy actions {st, at} 7: Compute corresponding rewards {rt} 8: Backward Q values for each stage and generate ∇θLre 9: Update θ using ∇θLre 10: end for 11: // fine-tune backbone CNN 12: for k = 1, 2, ..., N2 do 13: Sample random minibatch from {X} 14: Forward and calculate Lcls after runtime pruning by D 15: Backward and generate ∇CLcls 16: Update C using ∇CLcls 17: end for 18: end for 19: return C and D |
| Open Source Code | No | The paper mentions using 'the modified Caffe toolbox [20]' for implementation but does not provide a specific link or statement about releasing their own source code for the RNP framework. |
| Open Datasets | Yes | We conducted experiments on three different datasets including CIFAR-10, CIFAR-100 [22] and ILSVRC2012 [36] to show the effectiveness of our method. |
| Dataset Splits | Yes | We evaluated the top-5 error using single-view testing on ILSVRC2012-val set and trained RNP model using ILSVRC2012-train set. |
| Hardware Specification | Yes | Inference times were measured on a Titan X (Pascal) GPU with batch size 64. |
| Software Dependencies | No | The paper states 'All our experiments were implemented using the modified Caffe toolbox [20]' but does not provide a specific version number for Caffe or any other software dependencies. |
| Experiment Setup | Yes | The initialization was trained using SGD, with an initial learning rate of 0.01, decayed by a factor of 10 after 120 and 160 epochs, for 200 epochs in total. The rest of the training was conducted using RMSprop [42] with a learning rate of 1e-6. For the ϵ-greedy strategy, the hyper-parameter ϵ was annealed linearly from 1.0 to 0.1 in the beginning and fixed at 0.1 thereafter. For most experiments, we set the number of convolutional groups to k = 4... During training, we set the penalty for extra feature map calculation as p = 0.1... The scale factor α was set such that the average αLcls is approximately 1 |
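The two schedules quoted in the Experiment Setup row (step-decayed SGD learning rate, and ϵ annealed linearly from 1.0 to 0.1) can be sketched as small helper functions. This is a minimal illustration of those schedules as described, not code from the paper; the function names and the `anneal_steps` parameter are our own.

```python
def epsilon(step, anneal_steps):
    """Linear anneal of the epsilon-greedy exploration rate from 1.0
    down to 0.1 over `anneal_steps` steps, then held fixed at 0.1."""
    if step >= anneal_steps:
        return 0.1
    return 1.0 - 0.9 * step / anneal_steps

def sgd_lr(epoch, base_lr=0.01, milestones=(120, 160), gamma=0.1):
    """Step decay matching the quoted setup: start at 0.01 and divide
    by 10 after epochs 120 and 160 (training runs 200 epochs total)."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

For example, `epsilon(0, 1000)` gives 1.0 and `epsilon(2000, 1000)` gives the floor of 0.1, while `sgd_lr(130)` has decayed once to roughly 1e-3.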
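Algorithm 1's decision network chooses, per stage, how many of the k = 4 convolutional groups to compute, trading accuracy against a per-group penalty p = 0.1. A loose sketch of the ϵ-greedy action selection and a penalized reward of that shape is below; the exact reward in the paper may differ in form, so treat `reward` as illustrative only, and note that `select_action`, `rng`, and `cls_loss` are our own names.

```python
import random

def select_action(q_values, eps, rng=random):
    """Epsilon-greedy choice over the k channel groups (k = 4 in the
    quoted setup): with probability eps pick a random group count,
    otherwise pick the one with the highest predicted Q value."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def reward(alpha, cls_loss, action, penalty=0.1):
    """Sketch of a penalized reward: a scaled classification term
    (alpha set so alpha * Lcls is roughly 1, per the quoted setup)
    minus a penalty p for each extra feature-map group computed."""
    return -alpha * cls_loss - penalty * action
```

With exploration off (`eps=0.0`), `select_action` reduces to a pure argmax over the Q values, which is the exploitation path the trained decision network would follow at test time.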