Loss-aware Binarization of Deep Networks
Authors: Lu Hou, Quanming Yao, James T. Kwok
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both feedforward and recurrent networks show that the proposed loss-aware binarization algorithm outperforms existing binarization schemes, and is also more robust for wide and deep networks. In this section, we perform experiments on the proposed binarization scheme with both feedforward networks (Sections 4.1 and 4.2) and recurrent neural networks (Sections 4.3 and 4.4). |
| Researcher Affiliation | Academia | Lu Hou, Quanming Yao, James T. Kwok Department of Computer Science and Engineering Hong Kong University of Science and Technology Clear Water Bay, Hong Kong {lhouab,qyaoaa,jamesk}@cse.ust.hk |
| Pseudocode | Yes | Algorithm 1 Loss-Aware Binarization (LAB) for training a feedforward neural network. (A sketch of the per-layer binarization step follows the table.) |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing their source code, nor does it provide a direct link to a code repository for the methodology described. It only thanks developers of third-party tools. |
| Open Datasets | Yes | Experiments are performed on three commonly used data sets: 1. MNIST: This contains 28×28 gray images from ten digit classes. 2. CIFAR-10: This contains 32×32 color images from ten object classes. 3. SVHN: This contains 32×32 color images from ten digit classes. |
| Dataset Splits | Yes | 1. MNIST: ...We use 50000 images for training, another 10000 for validation, and the remaining 10000 for testing. 2. CIFAR-10: ...We use 45000 images for training, another 5000 for validation, and the remaining 10000 for testing. 3. SVHN: ...We use 598388 images for training, another 6000 for validation, and the remaining 26032 for testing. (A sketch of these splits follows the table.) |
| Hardware Specification | Yes | We also thank NVIDIA for the support of Titan X GPU. |
| Software Dependencies | No | We thank Yongqi Zhang for helping with the experiments, and developers of Theano (Theano Development Team, 2016), Pylearn2 (Goodfellow et al., 2013) and Lasagne. |
| Experiment Setup | Yes | The maximum number of epochs is 50. The learning rate for the weight-binarized (resp. weight-and-activation-binarized) network starts at 0.01 (resp. 0.005), and decays by a factor of 0.1 at epochs 15 and 25. Batch normalization, with a minibatch size 100, is used to accelerate learning. (A sketch of this schedule follows the table.) |
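The Pseudocode row above quotes only the caption of Algorithm 1 (LAB). For context, below is a minimal NumPy sketch of the per-layer binarization step, assuming the closed-form rescaled-sign solution derived in the paper, where the scaling is a weighted average of weight magnitudes with the diagonal second-moment estimate as weights. The function and variable names are ours, not the authors'.

```python
import numpy as np

def lab_binarize(w, d):
    """One loss-aware binarization step for a single layer (sketch).

    w : real-valued weight vector of the layer
    d : diagonal of the approximate Hessian, e.g. Adam's second-moment
        estimate; assumed positive

    Returns the scaling alpha and the binary code b, so that the
    binarized weights are alpha * b.
    """
    b = np.sign(w)
    b[b == 0] = 1.0                              # avoid zero entries in the binary code
    alpha = np.sum(d * np.abs(w)) / np.sum(d)    # ||d ⊙ w||_1 / ||d||_1
    return alpha, b

# Toy usage
w = np.array([0.3, -1.2, 0.05, -0.4])
d = np.array([0.9, 0.1, 0.5, 0.7])
alpha, b = lab_binarize(w, d)
print(alpha, b)   # the binarized layer weights would be alpha * b
```

With a uniform `d`, this reduces to the plain mean-absolute-value scaling of BinaryConnect-style schemes; the loss-aware variant instead weights each entry by its estimated curvature.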
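The Dataset Splits row reports fixed train/validation/test sizes. A small sketch of how such a split could be reproduced is shown below; the paper does not state how the validation subset is selected, so the shuffling and seed here are assumptions.

```python
import numpy as np

# Split sizes quoted in the Dataset Splits row.
SPLITS = {
    "MNIST":    {"train": 50000,  "val": 10000, "test": 10000},
    "CIFAR-10": {"train": 45000,  "val": 5000,  "test": 10000},
    "SVHN":     {"train": 598388, "val": 6000,  "test": 26032},
}

def split_train_val(x, y, n_train, n_val, seed=0):
    """Carve a validation set out of a shuffled training set (sketch)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])
```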
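The Experiment Setup row describes a step learning-rate schedule: start at 0.01 (or 0.005 when activations are also binarized), multiply by 0.1 at epochs 15 and 25, and train for at most 50 epochs with batch normalization and minibatches of 100. A framework-agnostic sketch of that schedule is below; the epoch-indexing convention and the training-loop body are placeholders, not the authors' code.

```python
MAX_EPOCHS = 50
BATCH_SIZE = 100   # minibatch size used with batch normalization

def learning_rate(epoch, base_lr=0.01, decay=0.1, milestones=(15, 25)):
    """Step schedule: start at base_lr, multiply by `decay` at each milestone.

    base_lr=0.01 for the weight-binarized network, 0.005 for the
    weight-and-activation-binarized one (per the quoted setup).
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= decay
    return lr

for epoch in range(MAX_EPOCHS):
    lr = learning_rate(epoch)
    # train one epoch here with minibatches of size BATCH_SIZE at rate lr
```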