Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition

Authors: Vadim Lebedev, Yaroslav Ganin, Victor Lempitsky, Maksim Rakhuba, and Ivan Oseledets

ICLR 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate this approach on two CNNs and show that it is competitive with previous approaches, leading to higher obtained CPU speedups at the cost of lower accuracy drops for the smaller of the two networks. |
| Researcher Affiliation | Collaboration | (1) Skolkovo Institute of Science and Technology (Skoltech), Moscow, Russia; (2) Yandex, Moscow, Russia; (3) Moscow Institute of Physics and Technology, Moscow Region, Russia; (4) Institute of Numerical Mathematics RAS, Moscow, Russia |
| Pseudocode | No | The paper describes its methods with mathematical equations and textual explanations; no explicit pseudocode or algorithm blocks are provided (a reconstruction sketch follows this table). |
| Open Source Code | No | The paper mentions using the Caffe package and Tensorlab, but provides no statement or link releasing code for the methodology it describes. |
| Open Datasets | Yes | We use the CNN described in (Jaderberg et al., 2014b) for our experiments. The network has four convolutional layers with maxout nonlinearities between them and a softmax output. It was trained to classify 24x24 image patches into one of 36 classes (10 digits plus 26 characters). Our Caffe port of the publicly available pre-trained model (referred to below as CharNet) achieves 91.2% accuracy on the test set (very similar to the original). Following Denton et al. (2014), we also consider the second convolutional layer of AlexNet (Krizhevsky et al., 2012). |
| Dataset Splits | No | The paper mentions training data and a test set but does not explicitly describe a validation split or give split percentages. |
| Hardware Specification | No | The paper states that "all timings are based on Caffe code run in the CPU mode on image batches of size 64" but does not give specific CPU or GPU models or other detailed hardware specifications. |
| Software Dependencies | Yes | As a software package to calculate the CP-decomposition we used Tensorlab (Sorber et al., 2014). (The reference list cites Tensorlab v2.0.) |
| Experiment Setup | Yes | All results are reported for a number of ranks R. We have applied our methods to two layers of the network using the following procedure. Firstly, layer 2 was approximated with rank 64. After that, the drop in accuracy was made small by fine-tuning of all layers but the new ones. Finally, layer 3 was approximated with rank 64. (See the staged sketch below.) |
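
The method the table summarizes has a compact computational core: take the 4-way kernel tensor of a d x d convolution with S input and T output channels, compute a rank-R CP decomposition, and replace the layer with four cheap convolutions (1x1, then d x 1 depthwise, then 1 x d depthwise, then 1x1). Since the paper ships neither pseudocode nor code, the following is a minimal reconstruction sketch, assuming PyTorch and TensorLy rather than the authors' Caffe/Tensorlab stack, stride 1, and default dilation; `cp_decompose_conv` is an illustrative name, not the authors' API.

```python
import torch
import torch.nn as nn
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend('pytorch')


def cp_decompose_conv(layer: nn.Conv2d, rank: int) -> nn.Sequential:
    """Replace `layer` with the paper's four-convolution CP pipeline:
    1x1 (S->R), d x 1 depthwise, 1 x d depthwise, 1x1 (R->T)."""
    W = layer.weight.data  # kernel tensor, shape (T, S, d, d)
    d = layer.kernel_size[0]

    # Rank-R CP decomposition:
    # W[t,s,i,j] ~= sum_r Kt[t,r] Ks[s,r] Ky[i,r] Kx[j,r]
    # (the CP weight vector is all ones with default settings, so ignore it)
    _, (Kt, Ks, Ky, Kx) = parafac(W, rank=rank, init='random')

    conv_s = nn.Conv2d(layer.in_channels, rank, kernel_size=1, bias=False)
    conv_y = nn.Conv2d(rank, rank, kernel_size=(d, 1), groups=rank,
                       padding=(layer.padding[0], 0), bias=False)
    conv_x = nn.Conv2d(rank, rank, kernel_size=(1, d), groups=rank,
                       padding=(0, layer.padding[1]), bias=False)
    conv_t = nn.Conv2d(rank, layer.out_channels, kernel_size=1,
                       bias=layer.bias is not None)

    # Load the CP factors into the small convolutions.
    conv_s.weight.data = Ks.t().reshape(rank, layer.in_channels, 1, 1)
    conv_y.weight.data = Ky.t().reshape(rank, 1, d, 1)
    conv_x.weight.data = Kx.t().reshape(rank, 1, 1, d)
    conv_t.weight.data = Kt.reshape(layer.out_channels, rank, 1, 1)
    if layer.bias is not None:
        conv_t.bias.data = layer.bias.data

    return nn.Sequential(conv_s, conv_y, conv_x, conv_t)
```

The speedup argument is per-pixel arithmetic: the original layer costs d²ST multiply-accumulates per output position, while the four-layer pipeline costs R(S + 2d + T), which is far smaller whenever the rank R is modest.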
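
The Experiment Setup row describes a staged schedule: approximate layer 2 at rank 64, fine-tune all layers except the newly inserted ones, then approximate layer 3. Below is a hypothetical sketch of that schedule reusing the helper above; `model.layer2`, `model.layer3`, and `train_fn` are illustrative names, since the paper's CharNet code is not public.

```python
import torch
import torch.nn as nn


def staged_compression(model: nn.Module, train_fn) -> nn.Module:
    """Staged schedule quoted in the Experiment Setup row (illustrative)."""
    # Stage 1: approximate layer 2 with rank 64, freeze the new CP factors,
    # and fine-tune every other layer to recover the accuracy drop.
    model.layer2 = cp_decompose_conv(model.layer2, rank=64)
    for p in model.layer2.parameters():
        p.requires_grad = False
    optimizer = torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3)
    train_fn(model, optimizer)  # standard training loop, supplied by caller

    # Stage 2: approximate layer 3 with rank 64 (the quoted procedure ends
    # here; a further fine-tuning pass would follow the same pattern).
    model.layer3 = cp_decompose_conv(model.layer3, rank=64)
    return model
```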