Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

Authors: Emily L. Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2, while keeping the accuracy within 1% of the original model. We present results showing the performance of the approximations described in Section 3 in terms of prediction accuracy, speedup gains and reduction in memory overhead.
Researcher Affiliation | Academia | Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun and Rob Fergus, Dept. of Computer Science, Courant Institute, New York University, {denton, zaremba, bruna, lecun, fergus}@cs.nyu.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper references Alex Krizhevsky’s CUDA convolution routines as a baseline (https://code.google.com/p/cuda-convnet/), but there is no explicit statement or link indicating that the authors' own code for the described methodology is open-source or publicly available.
Open Datasets | Yes | We use the 15 layer convolutional architecture of [8], trained on the ImageNet 2012 dataset [9].
Dataset Splits | Yes | All measurements of prediction performance are with respect to the 50K validation images from the ImageNet12 dataset.
Hardware Specification | Yes | All GPU code was run on a standard nVidia Titan card.
Software Dependencies | No | The paper mentions software like C++ with the Eigen3 library and Intel MKL, and Alex Krizhevsky's CUDA convolution routines, but it does not specify version numbers for these components.
Experiment Setup | Yes | All of our fine-tuning results were achieved by training with less than 2 passes using the ImageNet12 training dataset. Using the monochromatic approximation with 6 colors for the first layer and the biclustering with outer product decomposition approximation for the second layer (G = 48; H = 2; K = 8)
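For readers unfamiliar with the first-layer technique quoted in the Experiment Setup row, below is a minimal NumPy sketch of a monochromatic approximation: each filter is restricted to rank 1 along the RGB axis, and the resulting per-filter color vectors are clustered into a small number of shared colors (6 in the paper's setup). The function name, variable names, and tensor shapes are illustrative assumptions; this is not the authors' released code.

```python
# Hedged sketch of a monochromatic first-layer approximation, assuming
# filter weights W of shape (F, 3, d, d). Not the authors' implementation.
import numpy as np

def monochromatic_approx(W, num_colors=6):
    """Rank-1 color restriction per filter, with colors quantized to
    `num_colors` shared values via a naive k-means loop."""
    F, C, d, _ = W.shape
    colors = np.zeros((F, C))          # per-filter color component (scaled)
    mono = np.zeros((F, d, d))         # per-filter spatial component
    for f in range(F):
        Wf = W[f].reshape(C, d * d)                      # 3 x d^2 slice
        U, S, Vt = np.linalg.svd(Wf, full_matrices=False)
        colors[f] = U[:, 0] * S[0]                       # dominant color direction
        mono[f] = Vt[0].reshape(d, d)                    # dominant spatial pattern
    # Cluster the color vectors into num_colors shared colors.
    centers = colors[np.random.choice(F, num_colors, replace=False)]
    for _ in range(20):
        dists = ((colors[:, None] - centers[None]) ** 2).sum(-1)
        assign = np.argmin(dists, axis=1)
        for k in range(num_colors):
            if np.any(assign == k):
                centers[k] = colors[assign == k].mean(axis=0)
    # Reconstruct approximate filters: shared color outer spatial pattern.
    W_approx = np.einsum('fc,fij->fcij', centers[assign], mono)
    return W_approx, centers, assign, mono

# Example usage with made-up AlexNet-like first-layer dimensions:
# W = np.random.randn(96, 3, 7, 7)
# W_hat, centers, assign, mono = monochromatic_approx(W, num_colors=6)
```

At inference time the idea is that the input image only needs to be projected onto the few shared color channels once, after which each filter convolves a single monochromatic channel rather than all three.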
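The second-layer setting quoted above combines biclustering of input/output channels with an outer product decomposition (the G, H, K parameters). As a simpler, hedged stand-in for that idea, the sketch below applies a plain rank-K truncated SVD to the unfolded weight tensor; it illustrates the same compute and memory trade-off (K shared basis filters followed by a 1x1 recombination) but omits the biclustering step. Names and shapes are again assumptions for illustration.

```python
# Hedged sketch of a generic rank-K approximation of a convolutional
# weight tensor (F, C, d, d) via SVD of its unfolding. This is a
# simplified relative of the paper's decompositions, not its exact method.
import numpy as np

def rank_k_conv_approx(W, K):
    """Unfold W into an (F, C*d*d) matrix and keep its top-K singular
    directions, yielding two smaller factors usable as two cheaper layers."""
    F, C, d, _ = W.shape
    M = W.reshape(F, C * d * d)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    U_k = U[:, :K] * S[:K]              # (F, K): per-output mixing weights
    V_k = Vt[:K].reshape(K, C, d, d)    # (K, C, d, d): K shared basis filters
    W_approx = (U_k @ Vt[:K]).reshape(F, C, d, d)
    return W_approx, U_k, V_k

# Example usage with made-up second-layer dimensions:
# W = np.random.randn(256, 96, 5, 5)
# W_hat, U_k, V_k = rank_k_conv_approx(W, K=16)
```

When K is much smaller than F and C*d*d, applying the K basis filters first and then recombining their feature maps with a 1x1 convolution requires far fewer multiply-accumulates and far less weight storage than the original layer, which is the source of the speedups and memory savings the review rows above refer to.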