Pruning Convolutional Neural Networks for Resource Efficient Inference
Authors: Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically study the pruning criteria and procedure detailed in the previous section for a variety of problems. We focus many experiments on transfer learning problems, a setting where pruning seems to excel. We also present results for pruning large networks on their original tasks for more direct comparison with the existing pruning literature. Experiments are performed within Theano (Theano Development Team, 2016). Training and pruning are performed on the respective training sets for each problem, while results are reported on appropriate holdout sets, unless otherwise indicated. |
| Researcher Affiliation | Industry | Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz NVIDIA {pmolchanov, styree, tkarras, taila, jkautz}@nvidia.com |
| Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks. Figure 1 shows a flowchart. |
| Open Source Code | No | The paper does not contain an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We fine-tune the VGG-16 network (Simonyan & Zisserman, 2014) for classification of bird species using the Caltech-UCSD Birds 200-2011 dataset (Wah et al., 2011). ... Oxford Flowers 102 dataset (Nilsback & Zisserman, 2008)... large-scale ImageNet classification task. |
| Dataset Splits | Yes | The dataset consists of nearly 6000 training images and 5700 test images, covering 200 species. ... Oxford Flowers 102 dataset (Nilsback & Zisserman, 2008), with 2040 training and 6129 test images from 102 species of flowers. ... CaffeNet implementation of AlexNet with 79.2% top-5 validation accuracy. |
| Hardware Specification | Yes | Table 2: Actual speed up of networks pruned by Taylor criterion for various hardware setups. CPU: Intel Core i7-5930K; GPU: GeForce GTX TITAN X (Pascal); GPU: NVIDIA Jetson TX1; GPU: GeForce GT 730M |
| Software Dependencies | Yes | Experiments are performed within Theano (Theano Development Team, 2016). All measurements were performed with PyTorch with cuDNN v5.1.0 (except R3DCNN, which was implemented in C++ with cuDNN v4.0.4). |
| Experiment Setup | Yes | We fine-tune VGG-16 for 60 epochs with learning rate 0.0001 to achieve a test accuracy of 72.2%... At each pruning iteration, we remove a single feature map and then perform 30 minibatch SGD updates with batch-size 32, momentum 0.9, learning rate 10⁻⁴, and weight decay 10⁻⁴. |
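
The fine-tuning step reported above (30 minibatch SGD updates between pruning iterations, batch size 32, momentum 0.9, learning rate 10⁻⁴, weight decay 10⁻⁴) can be sketched as follows. This is a minimal illustration, not the paper's code: the tiny model and random minibatches are stand-ins, and only the optimizer hyperparameters come from the quoted setup.

```python
import torch
import torch.nn as nn

# Stand-in model; the paper fine-tunes VGG-16 (not reproduced here).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.Flatten(),
    nn.Linear(8 * 8 * 8, 10),
)

# Hyperparameters as reported: lr 1e-4, momentum 0.9, weight decay 1e-4.
optimizer = torch.optim.SGD(
    model.parameters(), lr=1e-4, momentum=0.9, weight_decay=1e-4
)
criterion = nn.CrossEntropyLoss()

# 30 minibatch SGD updates after each single-feature-map removal.
for step in range(30):
    inputs = torch.randn(32, 3, 8, 8)        # stand-in minibatch, batch size 32
    targets = torch.randint(0, 10, (32,))    # stand-in labels
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
```

The loop body is the generic PyTorch training step; in the paper's procedure it runs once per pruning iteration before the next feature map is scored and removed.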