Loss-aware Weight Quantization of Deep Networks
Authors: Lu Hou, James T. Kwok
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on feedforward and recurrent neural networks show that the proposed scheme outperforms state-of-the-art weight quantization algorithms, and is as accurate (or even more accurate) than the full-precision network. |
| Researcher Affiliation | Academia | Lu Hou, James T. Kwok, Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong. {lhouab, jamesk}@cse.ust.hk |
| Pseudocode | Yes | Algorithm 3 Loss-Aware Ternarization (LAT) for training a feedforward neural network. Algorithm 4 Exact solver for ŵ_l^t with two scaling parameters. Algorithm 5 Approximate solver for ŵ_l^t with two scaling parameters. (An illustrative sketch of the loss-aware ternarization step appears after this table.) |
| Open Source Code | No | No explicit statement about releasing their own source code was found. |
| Open Datasets | Yes | 1. MNIST: This contains 28×28 gray images from 10 digit classes. We use 50,000 images for training, another 10,000 for validation, and the remaining 10,000 for testing. 2. CIFAR-10: This contains 32×32 color images from 10 object classes. We use 45,000 images for training, another 5,000 for validation, and the remaining 10,000 for testing. 3. CIFAR-100: This contains 32×32 color images from 100 object classes. We use 45,000 images for training, another 5,000 for validation, and the remaining 10,000 for testing. 4. SVHN: This contains 32×32 color images from 10 digit classes. We use 598,388 images for training, another 6,000 for validation, and the remaining 26,032 for testing. The Penn Treebank data set (Taylor et al., 2003): ... with 5,017K characters for training, 393K for validation, and 442K characters for testing. |
| Dataset Splits | Yes | 1. MNIST: ... 10,000 for validation... 2. CIFAR-10: ... 5,000 for validation... 3. CIFAR-100: ... 5,000 for validation... 4. SVHN: ... 6,000 for validation... The Penn Treebank data set (Taylor et al., 2003): ... 393K for validation... |
| Hardware Specification | No | The paper mentions "NVIDIA for the gift of GPU card" but does not specify any particular GPU model, CPU, or other hardware components used for experiments. |
| Software Dependencies | Yes | We thank the developers of Theano (Theano Development Team, 2016), Pylearn2 (Goodfellow et al., 2013) and Lasagne. |
| Experiment Setup | Yes | For MNIST: 'Batch normalization with a minibatch size of 100 is used to accelerate learning. The maximum number of epochs is 50. The learning rate starts at 0.01, and decays by a factor of 0.1 at epochs 15 and 25.' For LSTMs: 'We use a one-layer LSTM with 512 cells. The maximum number of epochs is 200, and the number of time steps is 100. The initial learning rate is 0.002. After 10 epochs, it is decayed by a factor of 0.98 after each epoch. The weights are initialized uniformly in [-0.08, 0.08]. After each iteration, the gradients are clipped to the range [-5, 5].' (See the learning-rate schedule sketch after this table.) |
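
The Pseudocode row refers to the paper's loss-aware ternarization solvers. As a minimal sketch of the underlying idea only, and not the authors' exact Algorithms 3-5 (which operate layer-wise, optionally with two scaling parameters, and take the diagonal curvature estimate from Adam's second moments), the snippet below ternarizes a weight vector `w` to `alpha * b` with `b` in {-1, 0, +1} by alternating minimization of a diagonally weighted squared error. The function name `loss_aware_ternarize` and the toy inputs are our own illustrative choices.

```python
import numpy as np

def loss_aware_ternarize(w, d, n_iter=20):
    """Illustrative sketch (not the paper's exact algorithm): find alpha > 0 and
    b in {-1, 0, +1}^n approximately minimizing sum_i d_i * (w_i - alpha*b_i)^2,
    where d is a diagonal curvature estimate (e.g. Adam's second moments)."""
    # Initialize the scale from the curvature-weighted mean magnitude.
    alpha = np.sum(d * np.abs(w)) / np.sum(d)
    b = np.zeros_like(w)
    for _ in range(n_iter):
        # With alpha fixed, the optimal ternary code keeps entries whose
        # magnitude exceeds alpha / 2 and zeros out the rest.
        b = np.where(np.abs(w) >= alpha / 2, np.sign(w), 0.0)
        denom = np.sum(d * b * b)
        if denom == 0:
            break
        # With b fixed, the optimal scale is a weighted least-squares fit.
        alpha = np.sum(d * w * b) / denom
    return alpha, b

# Toy usage: curvature-weighted ternarization of a small weight vector.
w = np.array([0.9, -0.05, 0.4, -0.7, 0.02])
d = np.array([1.0, 0.1, 0.5, 2.0, 0.1])   # hypothetical curvature estimates
alpha, b = loss_aware_ternarize(w, d)
print(alpha, b)   # quantized weights are alpha * b
```

The loop simply alternates between the closed-form threshold rule for `b` given `alpha` and the weighted least-squares fit for `alpha` given `b`, which is the flavor of alternating solver the paper's algorithms describe.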
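For the Experiment Setup row, the quoted learning-rate schedules can be written down directly. The helpers below are a hedged sketch under our reading of the quotes (for example, that the LSTM rate first decays after epoch 10); the function names are illustrative and not from the paper.

```python
def mnist_lr(epoch, base_lr=0.01):
    """MNIST schedule as quoted: start at 0.01, decay by 0.1 at epochs 15 and 25."""
    lr = base_lr
    if epoch >= 15:
        lr *= 0.1
    if epoch >= 25:
        lr *= 0.1
    return lr

def lstm_lr(epoch, base_lr=0.002, decay=0.98):
    """LSTM schedule as quoted: 0.002 for the first 10 epochs, then multiplied
    by 0.98 after each subsequent epoch (our interpretation of the quote)."""
    return base_lr * (decay ** max(0, epoch - 10))

# Spot-check the schedules at a few epochs.
for e in (0, 14, 15, 25, 49):
    print("mnist", e, mnist_lr(e))
for e in (0, 10, 11, 199):
    print("lstm", e, round(lstm_lr(e), 6))
```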