Sparsity-Inducing Binarized Neural Networks
Authors: Peisong Wang, Xiangyu He, Gang Li, Tianli Zhao, Jian Cheng (pp. 12192-12199)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method dramatically outperforms current state-of-the-arts, lowering the performance gap between full-precision networks and BNNs on mainstream architectures, achieving the new state-of-the-art on binarized AlexNet (Top-1 50.5%), ResNet-18 (Top-1 59.7%), and VGG-Net (Top-1 63.2%). In this section, we evaluate the proposed Si-BNN in terms of accuracy and efficiency. Our experiments are conducted on MNIST, CIFAR-10 and ImageNet (Deng et al. 2009) datasets. |
| Researcher Affiliation | Academia | Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China |
| Pseudocode | No | No clearly labeled pseudocode or algorithm blocks were found in the paper. Figure 1 shows diagrams of forward and backward functions but is not pseudocode. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Our experiments are conducted on MNIST, CIFAR-10 and ImageNet (Deng et al. 2009) datasets. We train all the networks on ILSVRC2012 training dataset for 100 epochs. |
| Dataset Splits | Yes | Table 2: Validation accuracy (%) of AlexNet on ImageNet using different ρ. We train all the networks on ILSVRC2012 training dataset for 100 epochs. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments. It only mentions general terms like 'CPU' or 'GPU'. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify software environments, frameworks, or library versions (e.g., Python, PyTorch, TensorFlow, CUDA versions) that would be needed to reproduce the experiment. |
| Experiment Setup | Yes | Our binary networks are trained from scratch using Adam (Kingma and Ba 2015) with default settings. The batch size for ImageNet networks is 256. We use a weight decay of 1e-6 and momentum of 0.9 by default. Specifically, we set the weight decay of Δ and θ to zero. The initial learning rate is 0.001, then reduced by a factor of 10 at the 40th and 80th epochs. We train all the networks on ILSVRC2012 training dataset for 100 epochs. |
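
The experiment-setup row above can be summarized as a short PyTorch-style training sketch. This is a minimal reconstruction based only on the hyperparameters quoted in the table, not the authors' code: the model, data loader, and the `delta`/`theta` parameter names are placeholders, and reading "momentum of 0.9" as Adam's beta1 is an assumption.

```python
# Minimal sketch of the reported training configuration (assumptions noted inline).
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR
import torchvision.models as models

model = models.resnet18(num_classes=1000)  # stand-in for the binarized ResNet-18

# The paper sets the weight decay of Δ and θ to zero; here this is approximated
# with two parameter groups, assuming such parameters are identifiable by name
# (the "delta"/"theta" naming convention below is hypothetical).
decay_params, no_decay_params = [], []
for name, p in model.named_parameters():
    if "delta" in name or "theta" in name:
        no_decay_params.append(p)
    else:
        decay_params.append(p)

optimizer = Adam(
    [
        {"params": decay_params, "weight_decay": 1e-6},   # weight decay 1e-6
        {"params": no_decay_params, "weight_decay": 0.0}, # zero decay for Δ, θ
    ],
    lr=0.001,           # initial learning rate from the paper
    betas=(0.9, 0.999), # "momentum of 0.9" read as Adam's beta1 (assumption)
)

# Learning rate reduced by a factor of 10 at the 40th and 80th epochs.
scheduler = MultiStepLR(optimizer, milestones=[40, 80], gamma=0.1)

criterion = nn.CrossEntropyLoss()

# Training skeleton: 100 epochs on ILSVRC2012 with batch size 256.
# `train_loader` is assumed to be a standard ImageNet DataLoader.
# for epoch in range(100):
#     for images, targets in train_loader:
#         optimizer.zero_grad()
#         loss = criterion(model(images), targets)
#         loss.backward()
#         optimizer.step()
#     scheduler.step()
```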