Optimization based Layer-wise Magnitude-based Pruning for DNN Compression

Authors: Guiying Li, Chao Qian, Chunhui Jiang, Xiaofen Lu, Ke Tang

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that OLMP can achieve the best pruning ratio on LeNet-style models (i.e., 114 times for LeNet-300-100 and 298 times for LeNet-5) compared with some state-of-the-art DNN pruning methods, and can reduce the size of an AlexNet-style network up to 82 times without accuracy loss.
Researcher Affiliation | Academia | Guiying Li (1), Chao Qian (1), Chunhui Jiang (1), Xiaofen Lu (2), Ke Tang (3). (1) Anhui Province Key Lab of Big Data Analysis and Application, University of Science and Technology of China, Hefei 230027, China; (2) CERCIA, School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK; (3) Shenzhen Key Lab of Computational Intelligence, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China.
Pseudocode | No | The paper describes the proposed method conceptually and with flowcharts (Figure 1), but does not provide structured pseudocode or an algorithm block.
Open Source Code | No | The paper does not provide a specific repository link or an explicit statement about the release of source code for the methodology described.
Open Datasets | Yes | For the data sets and models, LeNet-5 and LeNet-300-100 are trained on MNIST, and AlexNet-Caltech is trained on Caltech-256 [Griffin et al., 2006].
Dataset Splits | Yes | For MNIST, a validation set containing 6,000 samples is selected uniformly at random from the training set (60,000 in total); the remaining samples form the new training set; the test set (10,000 in total) is untouched. For Caltech-256, the training set (15,420 in total) is untouched, while the validation set (15,187 in total) is uniformly randomly divided into a new validation set (7,530 in total) and a new test set (7,657 in total). A sketch of these splits follows the table.
Hardware Specification | Yes | All of the experiments are based on Caffe [Jia et al., 2014] and released projects of DS and NCS, and run on a workstation with one Titan X Pascal GPU and dual Intel E5-2683 v3 @ 2.0 GHz CPUs.
Software Dependencies | No | The paper mentions that experiments are based on Caffe and the released projects of DS and NCS, but it does not specify version numbers for these software components.
Experiment Setup | Yes | For the LeNet experiments, the hyper-parameters (K, pruning loops, pop N, Tmax) are set to (1000, 15, 10, 160). For LeNet-300-100, (δ, σ) is set to (8%, 5); for LeNet-5, δ is set to 5%, and σ is set to 5 in the first 10 pruning loops and 0.5 thereafter. For AlexNet-Caltech, the reference model is trained from scratch for 10,000 iterations using SGD, mainly following the experimental settings in [Zeiler and Fergus, 2014] except that the batch size is 256; the same SGD settings are also used during retraining. Its (K, pruning loops, pop N, Tmax, δ) is set to (250, 40, 8, 200, 8%), and σ is set to 5 in the first 28 pruning loops and 0.5 thereafter. These values are collected in the configuration sketch below.
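
The dataset splits quoted above amount to a single uniform random permutation of indices per dataset. Below is a minimal Python sketch, not the authors' code: the function names and the fixed seed are assumptions for illustration; only the set sizes come from the paper.

```python
import numpy as np

def split_mnist_train(num_train=60_000, val_size=6_000, seed=0):
    """Draw a 6,000-sample validation set uniformly at random from the 60,000
    MNIST training samples; the rest form the new training set.
    (The 10,000-sample test set is left untouched.)"""
    rng = np.random.default_rng(seed)        # seed is an assumption; the paper gives none
    perm = rng.permutation(num_train)
    return perm[val_size:], perm[:val_size]  # (new training indices, validation indices)

def split_caltech_val(num_val=15_187, new_val_size=7_530, seed=0):
    """Split the original Caltech-256 validation set (15,187 images) uniformly at
    random into a new validation set (7,530) and a new test set (7,657).
    (The 15,420-image training set is left untouched.)"""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_val)
    return perm[:new_val_size], perm[new_val_size:]

train_idx, val_idx = split_mnist_train()
new_val_idx, new_test_idx = split_caltech_val()
assert len(train_idx) == 54_000 and len(val_idx) == 6_000
assert len(new_val_idx) == 7_530 and len(new_test_idx) == 7_657
```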
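
For reference, the hyper-parameter values quoted in the Experiment Setup row can be collected into plain configuration dictionaries. This is only a restatement of the reported numbers; the key names (K, pruning_loops, pop_N, T_max, delta, sigma) are labels chosen here and do not correspond to identifiers in any released code.

```python
# Hyper-parameters as reported in the paper; key names are illustrative labels.
LENET_300_100 = {
    "K": 1000, "pruning_loops": 15, "pop_N": 10, "T_max": 160,
    "delta": 0.08,                      # 8%
    "sigma": 5.0,                       # constant over all pruning loops
}

LENET_5 = {
    "K": 1000, "pruning_loops": 15, "pop_N": 10, "T_max": 160,
    "delta": 0.05,                      # 5%
    "sigma": [5.0] * 10 + [0.5] * 5,    # 5 for the first 10 loops, 0.5 afterwards
}

ALEXNET_CALTECH = {
    "K": 250, "pruning_loops": 40, "pop_N": 8, "T_max": 200,
    "delta": 0.08,                      # 8%
    "sigma": [5.0] * 28 + [0.5] * 12,   # 5 for the first 28 loops, 0.5 afterwards
    # Reference-model training and retraining: SGD for 10,000 iterations with
    # batch size 256, otherwise following [Zeiler and Fergus, 2014].
    "sgd": {"iterations": 10_000, "batch_size": 256},
}
```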