Sparsity-Inducing Binarized Neural Networks

Authors: Peisong Wang, Xiangyu He, Gang Li, Tianli Zhao, Jian Cheng

AAAI 2020, pp. 12192-12199 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our method dramatically outperforms current state-of-the-arts, lowering the performance gap between full-precision networks and BNNs on mainstream architectures, achieving the new state-of-the-art on binarized AlexNet (Top-1 50.5%), ResNet-18 (Top-1 59.7%), and VGG-Net (Top-1 63.2%)." "In this section, we evaluate the proposed Si-BNN in terms of accuracy and efficiency. Our experiments are conducted on MNIST, CIFAR-10 and ImageNet (Deng et al. 2009) datasets."
Researcher Affiliation | Academia | Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Pseudocode | No | No clearly labeled pseudocode or algorithm blocks were found in the paper. Figure 1 shows diagrams of the forward and backward functions but is not pseudocode (a generic binarization forward/backward sketch is given after this table).
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described method.
Open Datasets | Yes | "Our experiments are conducted on MNIST, CIFAR-10 and ImageNet (Deng et al. 2009) datasets." "We train all the networks on ILSVRC2012 training dataset for 100 epochs."
Dataset Splits | Yes | "Table 2: Validation accuracy (%) of AlexNet on ImageNet using different ρ." "We train all the networks on ILSVRC2012 training dataset for 100 epochs."
Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., CPU or GPU models, memory, or cloud instances); it only mentions general terms like 'CPU' and 'GPU'.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify the software environment, frameworks, or library versions (e.g., Python, PyTorch, TensorFlow, CUDA) needed to reproduce the experiments.
Experiment Setup | Yes | "Our binary networks are trained from scratch using Adam (Kingma and Ba 2015) with default settings. The batch size for ImageNet networks is 256. We use a weight decay of 1e-6 and momentum of 0.9 in default. Specifically, we set the weight decay of Δ and θ to zero. The initial learning rate is 0.001, then reduced by a factor of 10 at 40th and 80th epoch. We train all the networks on ILSVRC2012 training dataset for 100 epochs." (See the hedged configuration sketch after this table.)
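On the Pseudocode row: the paper's Figure 1 depicts forward and backward binarization functions but gives no algorithm block. The snippet below is a minimal, hypothetical sketch of a generic sign-binarization layer with a clipped straight-through estimator, written in PyTorch. It is not the paper's exact Si-BNN forward/backward (which involves the trainable parameters Δ and θ mentioned in the Experiment Setup quote); it only illustrates the kind of function pair such a figure typically shows.

```python
import torch


class BinarizeSTE(torch.autograd.Function):
    """Generic sign binarization with a clipped straight-through estimator.

    Forward: map real-valued inputs to {-1, +1}.
    Backward: pass the gradient through where |x| <= 1, zero it elsewhere.
    """

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Clipped-identity ("hard tanh") surrogate gradient.
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)


if __name__ == "__main__":
    x = torch.randn(5, requires_grad=True)
    y = BinarizeSTE.apply(x)   # forward: values in {-1, 0, +1} (torch.sign maps 0 to 0)
    y.sum().backward()         # backward: surrogate gradient w.r.t. x
    print(y, x.grad)
```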
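On the Experiment Setup row: the paper quotes the optimizer and schedule but names no framework. Below is a minimal sketch of that configuration, assuming a PyTorch-style training loop; `model` is a placeholder, and the separate zero-weight-decay handling of Δ and θ mentioned in the quote would require parameter groups that are not shown here.

```python
import torch

# Placeholder network; the paper trains binarized AlexNet, ResNet-18, and VGG-Net.
model = torch.nn.Linear(512, 1000)

# "trained from scratch using Adam ... with default settings", initial lr 0.001,
# weight decay 1e-6 (the quoted momentum of 0.9 matches Adam's default beta1).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-6)

# "reduced by a factor of 10 at 40th and 80th epoch"
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[40, 80], gamma=0.1)

batch_size = 256   # ImageNet batch size reported in the paper
num_epochs = 100   # "for 100 epochs" on the ILSVRC2012 training set

for epoch in range(num_epochs):
    # ... one pass over the ILSVRC2012 training set with mini-batches of 256 ...
    scheduler.step()
```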