How to Train a Compact Binary Neural Network with High Accuracy?

Authors: Wei Tang, Gang Hua, Liang Wang

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our findings first reveal that a low learning rate is highly preferred to avoid frequent sign changes of the weights, which often makes the learning of Binary Nets unstable... The composition of all these enables us to train Binary Nets with both high compression rate and high accuracy, which is strongly supported by our extensive empirical study." "Table 1: Comparison of different methods on ImageNet dataset." (The effect of the learning rate on weight sign flips is sketched after the table.)
Researcher Affiliation | Collaboration | Wei Tang (1), Gang Hua (4), Liang Wang (1,2,3); 1: Institute of Automation, Chinese Academy of Sciences (CASIA); 2: Center for Excellence in Brain Science and Intelligence Technology, CAS; 3: University of Chinese Academy of Sciences; 4: Microsoft Research, Beijing, China
Pseudocode | Yes | "Algorithm 1 Training a L layers Binary Net." (A generic training-loop sketch follows the table.)
Open Source Code | No | The paper does not contain any explicit statement about making the source code available, or a link to a code repository for the described methodology.
Open Datasets | Yes | "The image classification task on the large-scale ImageNet dataset..." Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. 2015. ImageNet large scale visual recognition challenge.
Dataset Splits | Yes | "We hold out part of training images for hyper-parameter tuning and the final model is evaluated on the validation dataset with only single center crop." (A hold-out and center-crop sketch follows the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper only states "We implement our work on Caffe (Jia et al. 2014)"; no version numbers or other dependency details are reported.
Experiment Setup | Yes | "For the hyper-parameters, unless otherwise specified, the initial learning rate is set to 0.0001 and divided by 2 once the training loss stops decreasing. The parameter λ is set to 5×10^-7 and the batch size is set to 256. Just as previous works did, a batch normalization layer is used before each binary convolution layer and ADAM is used as the solver." (A hedged configuration sketch follows the table.)
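
The low-learning-rate finding quoted under Research Type can be illustrated with a small worked example. The weight value, gradient, and learning rates below are hypothetical and not taken from the paper; the point is only that a large step easily flips the sign of a near-zero real-valued weight, which in turn changes its binarized value.

```python
# Hypothetical numbers (not from the paper): a near-zero real-valued proxy
# weight and a unit gradient, updated with two different learning rates.
w, grad = 0.003, 1.0

for lr in (1e-2, 1e-4):
    w_new = w - lr * grad
    flipped = (w_new > 0) != (w > 0)
    print(f"lr={lr:g}: {w:+.4f} -> {w_new:+.4f}, sign flip: {flipped}")

# lr=0.01 flips the sign (so the binarized weight changes); lr=0.0001 does not,
# which is why a low learning rate keeps Binary Net training more stable.
```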
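The Pseudocode row refers to the paper's Algorithm 1 for training an L-layer Binary Net. The sketch below is not a transcription of that algorithm; it is a minimal PyTorch illustration of the general binary-weight training scheme it belongs to (store real-valued weights, binarize them with the sign function for the forward pass, and back-propagate through the binarization with a straight-through estimator). The class and layer names are mine.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator for the gradient."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass the gradient through only where the real-valued weight lies in [-1, 1].
        return grad_out * (w.abs() <= 1).float()

class BinaryLinear(nn.Linear):
    """Linear layer whose forward pass uses sign-binarized weights."""
    def forward(self, x):
        return F.linear(x, BinarizeSTE.apply(self.weight), self.bias)

# One training step: real-valued weights are stored and updated by the optimizer,
# while the forward pass sees only their binarized (+1/-1) version.
model = nn.Sequential(BinaryLinear(16, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x, y = torch.randn(4, 16), torch.randint(0, 10, (4,))
loss = F.cross_entropy(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```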
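For the Dataset Splits row, the reported protocol (hold out part of the ImageNet training images for hyper-parameter tuning, then evaluate the final model on the validation set with a single center crop) could look like the sketch below. The directory paths, hold-out size, and crop sizes are assumptions; the paper does not specify them.

```python
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Assumed paths, hold-out size, and crop sizes; none of these are reported in the paper.
train_tf = transforms.Compose([transforms.RandomResizedCrop(224),
                               transforms.ToTensor()])
eval_tf = transforms.Compose([transforms.Resize(256),
                              transforms.CenterCrop(224),  # single center crop
                              transforms.ToTensor()])

full_train = datasets.ImageFolder("imagenet/train", transform=train_tf)
n_heldout = 50_000  # hypothetical size of the hyper-parameter tuning split
train_set, tune_set = random_split(full_train,
                                   [len(full_train) - n_heldout, n_heldout])

val_set = datasets.ImageFolder("imagenet/val", transform=eval_tf)  # final evaluation
```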
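Finally, the Experiment Setup row lists concrete hyper-parameters. Below is a hedged PyTorch-style configuration sketch of those values (initial learning rate 1e-4 halved when the training loss plateaus, λ = 5e-7, batch size 256, ADAM, batch normalization before each binary convolution). It is not the authors' Caffe setup, and an ordinary convolution stands in for the binarized one.

```python
import torch
import torch.nn as nn

LAMBDA = 5e-7      # coefficient of the paper's regularization term
BATCH_SIZE = 256

def bn_then_conv(in_ch, out_ch):
    # BatchNorm is placed before each binary convolution, as stated in the paper;
    # a real-valued nn.Conv2d stands in here for the binarized convolution.
    return nn.Sequential(nn.BatchNorm2d(in_ch),
                         nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1))

model = nn.Sequential(bn_then_conv(3, 64),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1000))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)        # ADAM, lr = 0.0001
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5)

# Inside the training loop:
#   total_loss = task_loss + LAMBDA * reg_term   # reg_term as defined in the paper
#   scheduler.step(epoch_train_loss)             # halves lr once the loss stops decreasing
```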