Learning both Weights and Connections for Efficient Neural Network

Authors: Song Han, Jeff Pool, John Tran, William Dally

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We implemented network pruning in Caffe [26]. Caffe was modified to add a mask which disregards pruned parameters during network operation for each weight tensor. The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer's weights. We carried out the experiments on Nvidia Titan X and GTX980 GPUs. We pruned four representative networks: LeNet-300-100 and LeNet-5 on MNIST, together with AlexNet and VGG-16 on ImageNet. The network parameters and accuracy before and after pruning are shown in Table 1. (A code sketch of this pruning-and-retraining procedure follows the table.)
Researcher Affiliation | Collaboration | Song Han (Stanford University, songhan@stanford.edu); Jeff Pool (NVIDIA, jpool@nvidia.com); John Tran (NVIDIA, johntran@nvidia.com); William J. Dally (Stanford University and NVIDIA, dally@stanford.edu)
Pseudocode | No | The paper contains 'Figure 2: Three-Step Training Pipeline', which illustrates the process, but no structured pseudocode or algorithm blocks are provided.
Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9×, from 61 million to 6.7 million, without incurring accuracy loss. Similar experiments with VGG-16 found that the total number of parameters can be reduced by 13×, from 138 million to 10.3 million, again with no loss of accuracy. We first experimented on the MNIST dataset with the LeNet-300-100 and LeNet-5 networks [4].
Dataset Splits | Yes | We further examine the performance of pruning on the ImageNet ILSVRC-2012 dataset, which has 1.2M training examples and 50k validation examples.
Hardware Specification | Yes | We carried out the experiments on Nvidia Titan X and GTX980 GPUs.
Software Dependencies | No | The paper states, 'We implemented network pruning in Caffe [26],' but it does not provide a specific version number for Caffe or any other software dependencies.
Experiment Setup | Yes | After pruning, the network is retrained with 1/10 of the original network's original learning rate. (LeNet) After pruning, the whole network is retrained with 1/100 of the original network's initial learning rate. (AlexNet)
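
The "Research Type" and "Experiment Setup" rows above quote the core procedure: prune each weight tensor with a threshold equal to a quality parameter times the standard deviation of the layer's weights, keep a mask so pruned parameters are disregarded during network operation, then retrain the surviving connections at a reduced learning rate. The following is a minimal NumPy sketch of that idea, not the authors' modified Caffe implementation; the quality_parameter default, the learning-rate value, and the masked SGD helper are illustrative assumptions.

```python
import numpy as np

def prune_layer(weights, quality_parameter=1.0):
    """Magnitude pruning as described in the paper: connections whose absolute
    weight falls below quality_parameter * std(layer weights) are removed.
    The quality_parameter value here is an illustrative assumption; the paper
    tunes this sensitivity per layer."""
    threshold = quality_parameter * np.std(weights)
    mask = (np.abs(weights) > threshold).astype(weights.dtype)
    return weights * mask, mask

def masked_sgd_step(weights, grad, mask, lr):
    """One retraining step that keeps pruned connections at zero, a simple
    stand-in for the per-tensor mask the authors added to Caffe."""
    return (weights - lr * grad) * mask

# Toy walk-through of the three-step pipeline: train -> prune -> retrain.
rng = np.random.default_rng(0)
w = rng.normal(size=(300, 100)).astype(np.float32)  # stand-in for a trained LeNet-300-100 layer
w, mask = prune_layer(w, quality_parameter=1.0)
print(f"connections kept: {mask.mean():.1%}")

# Retrain the surviving weights at a reduced learning rate
# (1/10 of the original for LeNet, 1/100 for AlexNet, per the paper).
original_lr = 1e-2          # illustrative value, not reported in the paper
retrain_lr = original_lr / 10
grad = rng.normal(size=w.shape).astype(np.float32)  # placeholder gradient
w = masked_sgd_step(w, grad, mask, retrain_lr)
```

In the paper the mask is applied inside the network's forward and backward passes so pruned parameters receive no updates; the sketch approximates this by re-multiplying by the mask after each step.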