Trained Ternary Quantization
Authors: Chenzhuo Zhu, Song Han, Huizi Mao, William J. Dally
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on CIFAR-10 show that the ternary models obtained by trained quantization method outperform full-precision models of ResNet-32, 44, 56 by 0.04%, 0.16%, 0.36%, respectively. On ImageNet, our model outperforms full-precision AlexNet model by 0.3% of Top-1 accuracy and outperforms previous ternary models by 3%. |
| Researcher Affiliation | Collaboration | Chenzhuo Zhu (Tsinghua University, zhucz13@mails.tsinghua.edu.cn); Song Han (Stanford University, songhan@stanford.edu); Huizi Mao (Stanford University, huizi@stanford.edu); William J. Dally (Stanford University and NVIDIA, dally@stanford.edu) |
| Pseudocode | No | The paper describes its procedures in prose and includes a diagram (Figure 1), but it does not provide formal pseudocode or algorithm blocks. An illustrative sketch of the ternarization step is given below the table. |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We perform our experiments on CIFAR-10 (Krizhevsky & Hinton, 2009) and ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | Yes | CIFAR-10 is an image classification benchmark containing images of size 32×32 RGB pixels in a training set of 50000 and a test set of 10000. [...] ILSVRC12 is a 1000-category dataset with over 1.2 million images in the training set and 50 thousand images in the validation set. Images from ILSVRC12 also have various resolutions. We used a variant of the AlexNet (Krizhevsky et al., 2012) structure by removing dropout layers and adding batch normalization (Ioffe & Szegedy, 2015) for all models in our experiments. |
| Hardware Specification | No | The paper mentions that "On custom hardware, multiplications can be pre-computed..." but does not specify the hardware used for running the experiments described in the paper (e.g., specific GPU or CPU models). |
| Software Dependencies | No | Our network is implemented on both the TensorFlow (Abadi et al., 2015) and Caffe (Jia et al., 2014) frameworks. The paper names the frameworks but does not provide specific version numbers for TensorFlow or Caffe. |
| Experiment Setup | Yes | Learning rate is set to 0.1 at the beginning and scaled by 0.1 at epochs 80, 120 and 300. An L2-normalized weight decay of 0.0002 is used as a regularizer. Most of our models converge after 160 epochs. We take a moving average on errors of all epochs to filter off fluctuations when reporting error rate. [...] Minibatch size is set to 128. Learning rate starts at 10⁻⁴ and is scaled by 0.2 at epochs 56 and 64. An L2-normalized weight decay of 5×10⁻⁶ is used as a regularizer. Images are first resized to 256×256 then randomly cropped to 224×224 before input. A sketch of this schedule appears below the table. |
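Since the paper provides no formal pseudocode, the following is a minimal NumPy sketch of the ternarization step in the spirit of the method the paper describes: weights above a layer-wise threshold map to a learned positive scale, weights below the negative threshold map to a learned negative scale, and everything in between becomes zero. The threshold factor `t` and the unit scales `w_p`, `w_n` used here are illustrative assumptions, not values taken from the quoted text.

```python
import numpy as np

def ttq_ternarize(w, t=0.05, w_p=1.0, w_n=1.0):
    """Ternarize a full-precision weight tensor.

    Weights larger than a layer-wise threshold become +w_p, weights smaller
    than the negative threshold become -w_n, and the rest become 0.
    The threshold factor t and the scales w_p / w_n are placeholders;
    in the paper the scaling coefficients are trained per layer.
    """
    delta = t * np.max(np.abs(w))   # layer-wise threshold
    q = np.zeros_like(w)
    q[w > delta] = w_p              # positive ternary weights
    q[w < -delta] = -w_n            # negative ternary weights
    return q

# Example: ternarize a random weight matrix.
w = np.random.randn(64, 64).astype(np.float32)
w_ternary = ttq_ternarize(w)
```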
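The ImageNet setup quoted in the Experiment Setup row (minibatch size 128, learning rate 10⁻⁴ scaled by 0.2 at epochs 56 and 64, weight decay 5×10⁻⁶, 256×256 resize with a 224×224 random crop) translates roughly into the PyTorch sketch below. The optimizer choice (SGD) and the placeholder model are assumptions; the paper's experiments were run on TensorFlow and Caffe.

```python
import torch
import torchvision.transforms as T

# Preprocessing as described: resize to 256x256, then random-crop to 224x224.
train_transform = T.Compose([
    T.Resize((256, 256)),
    T.RandomCrop(224),
    T.ToTensor(),
])

# Placeholder standing in for the paper's AlexNet variant (dropout removed,
# batch normalization added); any small module works for illustration.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, kernel_size=11, stride=4),
    torch.nn.BatchNorm2d(64),
)

# SGD is an assumption; the quoted setup only gives the rate and the decay.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, weight_decay=5e-6)

# Learning rate scaled by 0.2 at epochs 56 and 64, per the quoted setup.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[56, 64], gamma=0.2)
```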