Equal Bits: Enforcing Equally Distributed Binary Network Weights

Authors: Yunqiang Li, Silvia-Laura Pintea, Jan C. van Gemert | Pages 1491-1499

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate experimentally that equal bit ratios are indeed preferable and show that our method leads to optimization benefits. We show that our quantization method is effective when compared to state-of-the-art binarization methods, even when using binary weight pruning.
Researcher Affiliation | Academia | Computer Vision Lab, Delft University of Technology, Netherlands {Y.Li-19, S.L.Pintea, J.C.van Gemert}@tudelft.nl
Pseudocode | No | The paper describes the method using mathematical equations and textual explanations but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/liyunqianggyn/Equal-Bits-BNN.
Open Datasets | Yes | We evaluate on Cifar-10, Cifar-100 (Krizhevsky and Hinton 2009) and ImageNet (Deng et al. 2009)
Dataset Splits | No | The paper mentions training on various datasets and provides hyperparameters but does not explicitly describe the use of a validation set or its split details.
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run the experiments (e.g., GPU/CPU models, memory details).
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., library names with versions).
Experiment Setup | Yes | We train the shallow models on Cifar-10 for 100 epochs, with weight decay 1e-4, momentum 0.9, batch size 128, and initial learning rate 0.1 using a cosine learning rate decay (Loshchilov and Hutter 2017). Following (Qin et al. 2020b) we also evaluate their ResNet-20 architecture and settings on Cifar-10. On Cifar-100, we evaluate our method on 5 different models... We train the Cifar-100 models for 350 epochs using SGD with weight decay 5e-4, momentum 0.9, batch size 128, and initial learning rate 0.1 divided by 10 at epochs 150, 250 and 320. For ImageNet we use ResNet-18 and ResNet-34 trained for 100 epochs using SGD with momentum 0.9, weight decay 1e-4, and batch size 256. Following (Liu et al. 2018; Qin et al. 2020b), the initial learning rate is set as 0.1 and we divide it by 10 at epochs 30, 60, 90.
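To make the quoted training settings concrete, the sketch below wires the reported hyperparameters into a PyTorch optimizer and learning-rate schedule. This is a minimal illustration under stated assumptions, not the authors' released code: the model is a placeholder, data loading is omitted, and only the numbers (SGD with momentum 0.9, weight decay 1e-4, initial learning rate 0.1, 100 epochs with cosine decay for the Cifar-10 shallow models, and a divide-by-10 step schedule at epochs 30, 60, 90 for ImageNet) come from the setup quoted above.

import torch
import torch.nn as nn

# Placeholder model; the paper's binary networks (e.g. ResNet-20/18/34) would go here.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

epochs = 100
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)

# Cifar-10 shallow models: cosine learning-rate decay (Loshchilov and Hutter 2017).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

# ImageNet (ResNet-18/34) uses a step schedule instead, dividing the rate by 10 at epochs 30, 60, 90:
# scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60, 90], gamma=0.1)

for epoch in range(epochs):
    # One pass over the training set (batch size 128 for Cifar, 256 for ImageNet) would go here.
    scheduler.step()

For the Cifar-100 runs, the same pattern applies with weight decay 5e-4, 350 epochs, and MultiStepLR milestones at epochs 150, 250 and 320.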