PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
Authors: Mikhail Figurnov, Aizhan Ibraimova, Dmitry P. Vetrov, Pushmeet Kohli
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that this method can reduce the evaluation time of modern CNN architectures proposed in the literature by a factor of 2-4× with a small decrease in accuracy. We use three convolutional neural networks of increasing size and computational complexity: Network in Network [17], AlexNet [14] and VGG-16 [25], see table 1. In all networks, we attempt to perforate all the convolutional layers, except for the 1×1 convolutional layers of NIN. We perform timings on a computer with a quad-core Intel Core i5-4460 CPU, 16 GB RAM and an NVIDIA GeForce GTX 980 GPU. The results are presented in table 3. (A sketch of the perforation idea appears below the table.) |
| Researcher Affiliation | Collaboration | 1) National Research University Higher School of Economics; 2) Lomonosov Moscow State University; 3) Yandex; 4) Skolkovo Institute of Science and Technology; 5) Microsoft Research |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is available at https://github.com/mfigurnov/perforated-cnn-matconvnet, https://github.com/mfigurnov/perforated-cnn-caffe. |
| Open Datasets | Yes | We use three convolutional neural networks of increasing size and computational complexity: Network in Network [17], AlexNet [14] and VGG-16 [25], see table 1. In all networks, we attempt to perforate all the convolutional layers, except for the 1×1 convolutional layers of NIN. From table 1: NIN on CIFAR-10 (top-1 error 10.4%), AlexNet on ImageNet (top-5 error 19.6%), VGG-16 on ImageNet (top-5 error 10.1%). |
| Dataset Splits | No | The paper mentions a 'training dataset' and 'training images' and reports metrics such as 'error increase' and 'speedup', but it does not explicitly describe a validation set, the train/validation/test splits, or how hyperparameters were tuned on a validation set. It therefore does not provide the dataset split information needed to reproduce the data partitioning. |
| Hardware Specification | Yes | We perform timings on a computer with a quad-core Intel Core i5-4460 CPU, 16 GB RAM and an NVIDIA GeForce GTX 980 GPU. |
| Software Dependencies | No | For AlexNet, the Caffe reimplementation is used, which is slightly different from the original architecture (pooling and normalization layers are swapped). We use a fork of the MatConvNet framework for all experiments, except for fine-tuning of AlexNet and VGG-16, for which we use a fork of Caffe. The source code is available at https://github.com/mfigurnov/perforated-cnn-matconvnet, https://github.com/mfigurnov/perforated-cnn-caffe. The paper names specific software frameworks (Caffe, MatConvNet) but does not provide their version numbers. |
| Experiment Setup | Yes | The batch size used for timings is 128 for NIN, 256 for AlexNet and 16 for VGG-16. We use twenty perforation rates: 1/3, . . . , 18/20. In order to decrease the error of the accelerated network, we tune the network's weights. We do not observe any problems with backpropagation, such as exploding/vanishing gradients. The results are presented in table 3. Finally, we perform the second round of fine-tuning with a much lower learning rate of 1e-9, due to exploding gradients. (A schematic of this two-round fine-tuning appears below the table.) |
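
The speedup quoted in the Research Type row comes from evaluating each convolutional layer only at a subset of output positions and filling the skipped ("perforated") positions by nearest-neighbor interpolation. Below is a minimal NumPy sketch of that idea for a single-channel layer with a uniform random perforation mask (one of the mask types the paper evaluates). The helper `perforated_conv2d` and its signature are our own illustration, not the authors' MatConvNet/Caffe code.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def perforated_conv2d(x, kernel, rate=0.5, seed=0):
    """Single-channel 'same' convolution evaluated at a random subset of
    output positions; perforated positions are filled with the value of
    the nearest evaluated position. Illustrative sketch only."""
    h, w = x.shape
    kh, kw = kernel.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))

    rng = np.random.default_rng(seed)
    skipped = rng.random((h, w)) < rate  # True = do not compute here

    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            if not skipped[i, j]:  # evaluate conv only where not perforated
                out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * kernel)

    # Nearest-neighbor interpolation: for every skipped position, look up
    # the index of the nearest evaluated position and copy its value.
    _, idx = distance_transform_edt(skipped, return_indices=True)
    return out[idx[0], idx[1]]
```

With `rate=0.5`, roughly half of the multiply-accumulate work is skipped; choosing the perforated positions well and then fine-tuning the weights is what lets the paper trade a small accuracy loss for the quoted 2-4× reduction in evaluation time.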
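
The Experiment Setup row also describes two rounds of weight tuning, the second at a much lower learning rate (1e-9) to cope with exploding gradients. The following is a minimal PyTorch-style schematic of such a schedule; the function name, optimizer choice, first-round learning rate, and epoch counts are our assumptions, not the authors' actual Caffe solver configuration.

```python
import torch


def fine_tune(model, loader, lr, epochs):
    """One round of plain SGD fine-tuning at a fixed learning rate."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            opt.zero_grad()
            loss_fn(model(inputs), targets).backward()
            opt.step()


# Round 1: ordinary fine-tuning of the perforated network.
# fine_tune(perforated_net, train_loader, lr=1e-3, epochs=5)   # lr/epochs are our guesses
# Round 2: much lower learning rate, as quoted, due to exploding gradients.
# fine_tune(perforated_net, train_loader, lr=1e-9, epochs=1)   # epochs is our guess
```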