Deep Learning with Limited Numerical Precision

Authors: Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, Pritish Narayanan

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the validity of the proposed approach by training deep neural networks for the MNIST (Lecun & Cortes) and CIFAR10 (Krizhevsky et al., 2012) image classification tasks. Deep networks trained using 16-bit wide fixed-point and stochastic rounding achieve nearly the same performance as that obtained when trained using 32-bit floating-point computations. (See the first sketch after this table.)
Researcher Affiliation | Industry | Suyog Gupta (SUYOG@US.IBM.COM), Ankur Agrawal (ANKURAGR@US.IBM.COM), Kailash Gopalakrishnan (KAILASH@US.IBM.COM): IBM T. J. Watson Research Center, Yorktown Heights, NY 10598; Pritish Narayanan (PNARAYA@US.IBM.COM): IBM Almaden Research Center, San Jose, CA 95120
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; it describes its procedures in prose rather than in a formal pseudocode format.
Open Source Code | No | The paper does not provide concrete access to source code. There is no specific repository link, explicit code release statement, or mention of code in supplementary materials for the methodology described.
Open Datasets | Yes | We test the validity of the proposed approach by training deep neural networks for the MNIST (Lecun & Cortes) and CIFAR10 (Krizhevsky et al., 2012) image classification tasks.
Dataset Splits | No | The paper specifies training and test sets for the MNIST and CIFAR10 datasets (e.g., '60,000 training images and 10,000 test images' for MNIST), but it does not explicitly specify any validation splits or cross-validation methodology.
Hardware Specification | Yes | Our prototype is implemented on an off-the-shelf FPGA card featuring a Xilinx Kintex325T FPGA and 8 GB DDR3 memory, and communicating with the host PC over a PCIe bus. This FPGA has 840 DSP multiply-accumulate units and almost 2 MB of on-chip block RAM.
Software Dependencies | No | The paper mentions 'vendor-supplied BLAS libraries' and 'Xilinx’s Vivado synthesis and place-and-route tool' but does not provide specific version numbers for these software components.
Experiment Setup | Yes | The weights in each layer are initialized by sampling random values from N(0, 0.01), while the bias vectors are initialized to 0. The network is trained using minibatch stochastic gradient descent (SGD) with a minibatch size of 100 to minimize the cross-entropy objective function. For CNNs, an exponentially decreasing learning rate is used, scaling it by a factor of 0.95 after every epoch of training. The learning rate for the first epoch is set to 0.1. Momentum (p = 0.9) is used to speed up SGD convergence. The weight decay parameter is set to 0.0005 for all layers. (See the second sketch after this table.)
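
The 16-bit result quoted under Research Type rests on the paper's stochastic rounding into a fixed-point format with IL integer bits and FL fractional bits: a value is snapped onto the grid of spacing 2^-FL, rounding up with probability equal to the fractional remainder so that the rounding is unbiased in expectation. Below is a minimal NumPy sketch of that idea, not the authors' FPGA or CPU implementation; the function name, the <8, 8> split in the example, and the saturation convention for out-of-range values are illustrative assumptions.

```python
import numpy as np

def stochastic_round_fixed_point(x, integer_bits, fraction_bits, rng=None):
    """Stochastically round x onto a signed <IL, FL> fixed-point grid.

    The grid spacing is eps = 2**-fraction_bits. A value is rounded down to the
    nearest multiple of eps with probability proportional to its distance from
    the grid point above, and rounded up otherwise, so the expected rounded
    value equals x. Out-of-range values are saturated (an assumed overflow rule).
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = 2.0 ** -fraction_bits
    scaled = np.asarray(x, dtype=np.float64) / eps
    floor = np.floor(scaled)
    round_up = rng.random(floor.shape) < (scaled - floor)  # P(round up) = fractional part
    q = (floor + round_up) * eps
    limit = 2.0 ** (integer_bits - 1)                       # signed range of <IL, FL>
    return np.clip(q, -limit, limit - eps)

# Illustrative 16-bit word split as <8, 8>; the paper explores several splits.
x = np.array([0.30, -1.27, 3.14159])
print(stochastic_round_fixed_point(x, integer_bits=8, fraction_bits=8))
```

Because the rounding error is zero in expectation, small gradient contributions are not systematically discarded the way they are under round-to-nearest, which is the mechanism the paper credits for 16-bit training tracking 32-bit floating-point accuracy.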
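The hyperparameters quoted under Experiment Setup can be gathered into one training configuration. The sketch below is a modern PyTorch rendering of those values, not the authors' implementation (the paper used its own CPU and FPGA code paths); the MNIST-style layer sizes, the helper names init_weights and train_one_epoch, and the data loader are assumptions, while the initialization, minibatch size, learning-rate schedule, momentum, and weight decay follow the quoted numbers.

```python
import torch
from torch import nn, optim

def init_weights(module):
    """Initialize weights from N(0, 0.01) and biases to zero, as quoted above."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.normal_(module.weight, mean=0.0, std=0.01)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Stand-in MNIST-sized fully connected network; the layer sizes are assumptions.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 1000), nn.ReLU(),
                      nn.Linear(1000, 1000), nn.ReLU(), nn.Linear(1000, 10))
model.apply(init_weights)

criterion = nn.CrossEntropyLoss()                       # cross-entropy objective
optimizer = optim.SGD(model.parameters(), lr=0.1,       # first-epoch learning rate 0.1
                      momentum=0.9,                     # momentum 0.9
                      weight_decay=0.0005)              # weight decay 0.0005
# CNN schedule: scale the learning rate by 0.95 after every epoch.
scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

def train_one_epoch(loader):
    """One epoch of minibatch SGD; `loader` is assumed to yield batches of 100."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

In the paper's setting, the quantization of weights, activations, and gradients to 16-bit fixed point with stochastic rounding (first sketch) would be layered on top of these otherwise standard SGD updates.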