Fast and Accurate Binary Neural Networks Based on Depth-Width Reshaping

Authors: Ping Xue, Yang Lu, Jingfei Chang, Xing Wei, Zhen Wei

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate the analytical result and the effectiveness of the proposed method. Compared with the original backbones, the DWR backbones constructed by the proposed method result in close to O(√s) decrease in activations, while achieving an absolute accuracy increase of up to 1.7% with comparable computational cost. Besides, by using the DWR backbones, existing methods can achieve new state-of-the-art (SOTA) accuracy (e.g., 67.2% on ImageNet with ResNet-18 as the original backbone)." A back-of-envelope check of the O(√s) claim is sketched after this table.
Researcher Affiliation | Academia | 1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; 2. Anhui Mine IOT and Security Monitoring Technology Key Laboratory, Hefei, China; 3. Engineering Research Center of Safety Critical Industrial Measurement and Control Technology, Ministry of Education, Hefei University of Technology, Hefei, China; 4. Intelligent Manufacturing Institute of Hefei University of Technology, Hefei, China
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | "The code is available at https://github.com/pingxue-hfut/DWR."
Open Datasets | Yes | "To evaluate the performance of DWR, extensive experiments are conducted on two widely used datasets, including CIFAR-10 (Torralba, Fergus, and Freeman 2008) and ImageNet ILSVRC12 (Deng et al. 2009)." A hedged data-loading sketch follows the table.
Dataset Splits | No | The paper mentions training and testing sets for CIFAR-10 and ImageNet but does not explicitly specify a validation set split or a methodology for creating one.
Hardware Specification | Yes | "In fact, we use 2 Nvidia RTX Titan GPUs with the same training scheme for both the original and DWR backbones without additional optimization methods; the original one takes about 4 days, whereas the DWR backbone needs only 3.5 days, which indicates a 12.5% speedup."
Software Dependencies | No | The paper states "We implement DWR with PyTorch" but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | "Specifically, on CIFAR-10, VGG-Small, VGG-11, ResNet-20, and ResNet-18 are used as the original backbones, respectively, trained using the SGD algorithm with weight decay 1e-4, momentum 0.9, and batch size 128; for ImageNet, ResNet-18 with the additional shortcut is used as the original backbone (Liu et al. 2020a), trained using the Adam optimizer with weight decay 1e-5, momentum 0.9, and batch size 512." A hedged PyTorch sketch of these optimizer settings follows the table.
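
The O(√s) activation claim and the "comparable computational cost" claim quoted in the Research Type row are mutually consistent under a simple scaling argument. The derivation below is our back-of-envelope sketch, not the paper's analysis; it assumes depth is compressed by a factor s while each remaining layer is widened by √s, with L convolutional layers, C channels, spatial size HW, and kernel size k.

```latex
% Our assumption: depth L -> L/s, width C -> sqrt(s)*C.
% Total activations = layers x channels x spatial positions:
\[
  A_{\mathrm{DWR}} = \frac{L}{s}\,(\sqrt{s}\,C)\,HW
                   = \frac{1}{\sqrt{s}}\,\underbrace{L\,C\,HW}_{A_{\mathrm{orig}}}
\]
% Conv FLOPs per layer scale with C_in * C_out, i.e., with C^2:
\[
  F_{\mathrm{DWR}} = \frac{L}{s}\,(\sqrt{s}\,C)^{2}\,k^{2}HW
                   = L\,C^{2}k^{2}HW = F_{\mathrm{orig}}
\]
```

Under these assumptions, activations shrink by a factor of √s (an O(√s) decrease) while total FLOPs are unchanged, which matches both parts of the quoted abstract.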
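For the Open Datasets row, here is a minimal torchvision sketch of loading the two reported datasets. The quoted excerpt gives no loading code, so the transforms, root paths, and augmentation choices below are our assumptions; the ImageNet path is a placeholder, since ILSVRC12 must be obtained separately.

```python
# Minimal sketch of loading the two reported datasets with torchvision.
# Transforms and local paths are our assumptions, not the paper's code.
import torchvision.datasets as datasets
import torchvision.transforms as transforms

cifar_train = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.Compose([
        transforms.RandomCrop(32, padding=4),   # common CIFAR-10 augmentation (assumed)
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ]),
)

# ImageNet ILSVRC12 is not auto-downloadable; "path/to/imagenet" is a placeholder.
imagenet_train = datasets.ImageFolder(
    "path/to/imagenet/train",
    transform=transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ]),
)
```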
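For the Experiment Setup row, here is a hedged PyTorch sketch of the quoted optimizer settings. Learning rates are not given in the excerpt, so the values below are placeholders, and `model` is a stand-in module; for Adam we read "momentum 0.9" as beta1 = 0.9, which is our assumption.

```python
# Sketch of the quoted optimizer settings; lr values and `model` are placeholders.
import torch

model = torch.nn.Linear(10, 10)  # stand-in for the VGG/ResNet backbones

# CIFAR-10 backbones: SGD with weight decay 1e-4 and momentum 0.9
# (batch size 128 would be set on the DataLoader, not the optimizer).
sgd = torch.optim.SGD(model.parameters(), lr=0.1,        # lr assumed
                      momentum=0.9, weight_decay=1e-4)

# ImageNet backbone (ResNet-18 + additional shortcut): Adam with weight decay
# 1e-5; "momentum 0.9" read as beta1 = 0.9 (batch size 512 set on the DataLoader).
adam = torch.optim.Adam(model.parameters(), lr=1e-3,     # lr assumed
                        betas=(0.9, 0.999), weight_decay=1e-5)
```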