Loss-aware Binarization of Deep Networks

Authors: Lu Hou, Quanming Yao, James T. Kwok

ICLR 2017

Reproducibility variables, results, and the supporting LLM responses:

Research Type: Experimental
LLM Response: Experiments on both feedforward and recurrent networks show that the proposed loss-aware binarization algorithm outperforms existing binarization schemes, and is also more robust for wide and deep networks. In this section, we perform experiments on the proposed binarization scheme with both feedforward networks (Sections 4.1 and 4.2) and recurrent neural networks (Sections 4.3 and 4.4).

Researcher Affiliation: Academia
LLM Response: Lu Hou, Quanming Yao, James T. Kwok, Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. {lhouab,qyaoaa,jamesk}@cse.ust.hk

Pseudocode: Yes
LLM Response: Algorithm 1: Loss-Aware Binarization (LAB) for training a feedforward neural network.
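
Since only the algorithm's caption is quoted above, the following is a minimal NumPy sketch, written here for illustration only, of the per-layer binarization step that Algorithm 1 (LAB) applies: the binary directions are the sign of the full-precision weights, and the scaling factor weights each coordinate by a diagonal curvature estimate d (in the paper, maintained from the optimizer's second-moment estimates). The function name, epsilon term, and toy inputs are assumptions, not the authors' code.

```python
import numpy as np

def lab_binarize(w, d, eps=1e-8):
    """Loss-aware binarization of one layer's weights (illustrative sketch).

    w : full-precision weight vector of the layer.
    d : positive diagonal approximation of the Hessian for that layer
        (e.g. derived from the optimizer's second-moment estimates).

    Returns alpha * b with b = sign(w) and alpha = ||d * w||_1 / ||d||_1,
    the closed-form proximal Newton solution described in the paper.
    """
    b = np.where(w >= 0.0, 1.0, -1.0)                  # binary directions
    alpha = np.sum(d * np.abs(w)) / (np.sum(d) + eps)  # curvature-weighted scale
    return alpha * b

# Toy usage with made-up numbers: coordinates with larger curvature
# estimates contribute more to the scaling factor.
w = np.array([0.30, -1.20, 0.05, 0.80])
d = np.array([1.0, 0.5, 2.0, 1.5])
print(lab_binarize(w, d))
```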

Open Source Code: No
LLM Response: The paper does not include an unambiguous statement that the authors are releasing their source code, nor does it provide a direct link to a code repository for the methodology described. It only thanks the developers of third-party tools.

Open Datasets: Yes
LLM Response: Experiments are performed on three commonly used data sets:
1. MNIST: This contains 28×28 gray images from ten digit classes.
2. CIFAR-10: This contains 32×32 color images from ten object classes.
3. SVHN: This contains 32×32 color images from ten digit classes.

Dataset Splits: Yes
LLM Response:
1. MNIST: ... We use 50000 images for training, another 10000 for validation, and the remaining 10000 for testing.
2. CIFAR-10: ... We use 45000 images for training, another 5000 for validation, and the remaining 10000 for testing.
3. SVHN: ... We use 598388 images for training, another 6000 for validation, and the remaining 26032 for testing.
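
For quick reference, the quoted split sizes can be written down as a small configuration dictionary. This simply restates the numbers above; the dictionary itself is not something the paper provides.

```python
# Train/validation/test split sizes quoted above (number of images per split).
DATASET_SPLITS = {
    "MNIST":    {"train": 50000,  "val": 10000, "test": 10000},
    "CIFAR-10": {"train": 45000,  "val": 5000,  "test": 10000},
    "SVHN":     {"train": 598388, "val": 6000,  "test": 26032},
}
```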

Hardware Specification: Yes
LLM Response: We also thank NVIDIA for the support of Titan X GPU.

Software Dependencies: No
LLM Response: We thank Yongqi Zhang for helping with the experiments, and developers of Theano (Theano Development Team, 2016), Pylearn2 (Goodfellow et al., 2013) and Lasagne.

Experiment Setup: Yes
LLM Response: The maximum number of epochs is 50. The learning rate for the weight-binarized (resp. weight-and-activation-binarized) network starts at 0.01 (resp. 0.005), and decays by a factor of 0.1 at epochs 15 and 25. Batch normalization, with a minibatch size 100, is used to accelerate learning.
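
The quoted schedule is simple enough to sketch in a few lines. The helper below is an illustration of that description, not the authors' code, and it assumes 1-based epoch counting for the decay points at epochs 15 and 25.

```python
def learning_rate(epoch, base_lr=0.01):
    """Step schedule from the quoted setup: decay by 0.1 at epochs 15 and 25.

    base_lr is 0.01 for the weight-binarized network and 0.005 for the
    weight-and-activation-binarized one, per the paper's description.
    Epochs are assumed to be numbered from 1.
    """
    lr = base_lr
    if epoch >= 15:
        lr *= 0.1
    if epoch >= 25:
        lr *= 0.1
    return lr

# 50 epochs in total, minibatch size 100 (batch normalization inside the model).
for epoch in range(1, 51):
    lr = learning_rate(epoch)
    # ... run one epoch of training with minibatches of 100 at rate lr ...
```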