Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
Authors: Emily L Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2, while keeping the accuracy within 1% of the original model. We present results showing the performance of the approximations described in Section 3 in terms of prediction accuracy, speedup gains and reduction in memory overhead. |
| Researcher Affiliation | Academia | Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun and Rob Fergus Dept. of Computer Science, Courant Institute, New York University {denton, zaremba, bruna, lecun, fergus}@cs.nyu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper references Alex Krizhevsky’s CUDA convolution routines as a baseline (https://code.google.com/p/cuda-convnet/), but there is no explicit statement or link indicating that the authors' own code for the described methodology is open-source or publicly available. |
| Open Datasets | Yes | We use the 15 layer convolutional architecture of [8], trained on the ImageNet 2012 dataset [9]. |
| Dataset Splits | Yes | All measurements of prediction performance are with respect to the 50K validation images from the ImageNet12 dataset. |
| Hardware Specification | Yes | All GPU code was run on a standard nVidia Titan card. |
| Software Dependencies | No | The paper mentions C++ with the Eigen3 library, Intel MKL, and Alex Krizhevsky's CUDA convolution routines, but it does not specify version numbers for these components. |
| Experiment Setup | Yes | All of our fine-tuning results were achieved by training with less than 2 passes using the ImageNet12 training dataset. Using the monochromatic approximation with 6 colors for the first layer and the biclustering with outer product decomposition approximation for the second layer (G = 48; H = 2; K = 8) (see the sketch below). |
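
For context on the first-layer setting quoted in the Experiment Setup row, below is a minimal NumPy sketch of a monochromatic-style approximation. This is not the authors' released code: the function name `monochromatic_approx`, the filter layout `(F, 3, d, d)`, and the small k-means loop are illustrative assumptions. The idea, following the paper's description, is to factor each first-layer filter into one spatial filter times a per-filter color vector (a rank-1 SVD over the color dimension) and then quantize those color vectors to a small shared palette (e.g., 6 colors).

```python
import numpy as np

def monochromatic_approx(W, num_colors=6, iters=20, seed=0):
    """Sketch of a monochromatic first-layer approximation (illustrative only).

    W: first-layer filters of shape (F, 3, d, d).
    Returns the approximated filters, the shared color palette, the
    color assignment per filter, and the monochrome spatial filters.
    """
    F, c_in, d, _ = W.shape
    colors = np.zeros((F, c_in))
    mono = np.zeros((F, d, d))

    # Rank-1 decomposition per filter over the color dimension:
    # W_f (3 x d^2) ~= c_f * w_f^T, with c_f in R^3.
    for f in range(F):
        Wf = W[f].reshape(c_in, d * d)
        U, S, Vt = np.linalg.svd(Wf, full_matrices=False)
        colors[f] = U[:, 0] * S[0]
        mono[f] = Vt[0].reshape(d, d)

    # Tiny k-means over the per-filter color vectors so that all filters
    # share only `num_colors` distinct colors.
    rng = np.random.default_rng(seed)
    centers = colors[rng.choice(F, num_colors, replace=False)]
    for _ in range(iters):
        dists = ((colors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for k in range(num_colors):
            if np.any(assign == k):
                centers[k] = colors[assign == k].mean(axis=0)

    # Reconstruct the approximated filters: shared color x monochrome filter.
    W_approx = centers[assign][:, :, None, None] * mono[:, None, :, :]
    return W_approx, centers, assign, mono

# Example usage (filter sizes here are placeholders, not the paper's exact layer shapes):
W = np.random.randn(96, 3, 7, 7)
W_hat, palette, assign, mono = monochromatic_approx(W, num_colors=6)
```

The second-layer setting quoted above (biclustering with an outer product decomposition, G = 48; H = 2; K = 8) rests on a related low-rank idea but first clusters input and output features into blocks before factorizing each block; it is not sketched here.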