Learning both Weights and Connections for Efficient Neural Network
Authors: Song Han, Jeff Pool, John Tran, William Dally
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implemented network pruning in Caffe [26]. Caffe was modified to add a mask which disregards pruned parameters during network operation for each weight tensor. The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer's weights. We carried out the experiments on Nvidia Titan X and GTX980 GPUs. We pruned four representative networks: LeNet-300-100 and LeNet-5 on MNIST, together with AlexNet and VGG-16 on ImageNet. The network parameters and accuracy before and after pruning are shown in Table 1. (A minimal sketch of the masking rule appears below the table.) |
| Researcher Affiliation | Collaboration | Song Han (Stanford University, songhan@stanford.edu); Jeff Pool (NVIDIA, jpool@nvidia.com); John Tran (NVIDIA, johntran@nvidia.com); William J. Dally (Stanford University and NVIDIA, dally@stanford.edu) |
| Pseudocode | No | The paper includes 'Figure 2: Three-Step Training Pipeline', which illustrates the process, but provides no structured pseudocode or algorithm blocks. (A sketch of the pipeline appears below the table.) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9×, from 61 million to 6.7 million, without incurring accuracy loss. Similar experiments with VGG-16 found that the total number of parameters can be reduced by 13×, from 138 million to 10.3 million, again with no loss of accuracy. We first experimented on the MNIST dataset with the LeNet-300-100 and LeNet-5 networks [4]. |
| Dataset Splits | Yes | We further examine the performance of pruning on the ImageNet ILSVRC-2012 dataset, which has 1.2M training examples and 50k validation examples. |
| Hardware Specification | Yes | We carried out the experiments on Nvidia Titan X and GTX980 GPUs. |
| Software Dependencies | No | The paper states, 'We implemented network pruning in Caffe [26],' but it does not provide a specific version number for Caffe or any other software dependencies. |
| Experiment Setup | Yes | After pruning, the network is retrained with 1/10 of the original network's original learning rate. (LeNet) After pruning, the whole network is retrained with 1/100 of the original network's initial learning rate. (AlexNet) (Both rates appear in the pipeline sketch below.) |
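
The masking rule quoted under Research Type is easy to restate in code. The authors' actual implementation lives inside a modified Caffe; the block below is only a hypothetical PyTorch re-expression of the same idea, and the names `prune_mask`, `apply_masks`, and `quality` are our own labels, not the paper's.

```python
import torch

def prune_mask(weight: torch.Tensor, quality: float) -> torch.Tensor:
    # Threshold = quality parameter x standard deviation of the
    # layer's weights (the rule quoted from the paper).
    threshold = quality * weight.std()
    # Keep only connections whose magnitude exceeds the threshold.
    return (weight.abs() > threshold).float()

def apply_masks(model: torch.nn.Module, masks: dict) -> None:
    # Zero pruned weights in place. Calling this after every optimizer
    # step plays the role of the mask the authors added to Caffe, which
    # disregards pruned parameters during network operation.
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])
```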
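
The three-step pipeline of Figure 2 (train, prune, retrain), together with the reduced retraining learning rates quoted under Experiment Setup, can be sketched as follows, reusing `prune_mask` and `apply_masks` from the block above. Here `train_fn`, `quality_per_layer`, and `post_step` are hypothetical names introduced for illustration; the paper also notes that the prune-and-retrain steps can be iterated rather than run once.

```python
def three_step_pipeline(model, train_fn, quality_per_layer, base_lr):
    # Step 1: conventional training learns which connections matter.
    train_fn(model, lr=base_lr)

    # Step 2: prune connections below each layer's threshold
    # (quality parameter x std of that layer's weights).
    masks = {name: prune_mask(param, quality_per_layer[name])
             for name, param in model.named_parameters()
             if name in quality_per_layer}
    apply_masks(model, masks)

    # Step 3: retrain the surviving weights at a reduced learning rate:
    # 1/10 of the original rate for the LeNets, 1/100 for AlexNet,
    # per the quotes in the table above.
    retrain_scale = 0.1  # use 0.01 for AlexNet
    train_fn(model, lr=base_lr * retrain_scale,
             post_step=lambda: apply_masks(model, masks))
    return masks
```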