Fast and Accurate Binary Neural Networks Based on Depth-Width Reshaping
Authors: Ping Xue, Yang Lu, Jingfei Chang, Xing Wei, Zhen Wei
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the analytical result and the effectiveness of the proposed method. Compared with the original backbones, the DWR backbones constructed by the proposed method result in a close to O(√s) decrease in activations, while achieving an absolute accuracy increase of up to 1.7% with comparable computational cost. Besides, by using the DWR backbones, existing methods can achieve new state-of-the-art (SOTA) accuracy (e.g., 67.2% on ImageNet with ResNet-18 as the original backbone). |
| Researcher Affiliation | Academia | (1) School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; (2) Anhui Mine IoT and Security Monitoring Technology Key Laboratory, Hefei, China; (3) Engineering Research Center of Safety Critical Industrial Measurement and Control Technology, Ministry of Education, Hefei University of Technology, Hefei, China; (4) Intelligent Manufacturing Institute of Hefei University of Technology, Hefei, China |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/pingxue-hfut/DWR. |
| Open Datasets | Yes | To evaluate the performance of DWR, extensive experiments are conducted on two widely used datasets, including CIFAR-10 (Torralba, Fergus, and Freeman 2008) and ImageNet ILSVRC12 (Deng et al. 2009). |
| Dataset Splits | No | The paper mentions training and testing sets for CIFAR-10 and ImageNet but does not explicitly specify a validation set split or methodology for creating one. |
| Hardware Specification | Yes | In fact, we use 2 Nvidia RTX Titan GPUs with the same training scheme for both the original and DWR backbones without additional optimization methods; the original one takes about 4 days, whereas the DWR backbone needs only 3.5 days, which indicates a 12.5% speedup. |
| Software Dependencies | No | The paper states 'We implement DWR with PyTorch' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Specifically, on CIFAR-10, VGG-Small, VGG-11, ResNet-20, and ResNet-18 are used as the original backbones, respectively, trained using the SGD algorithm with weight decay 1e-4, momentum 0.9, and batch size 128; for ImageNet, ResNet-18 with the additional shortcut is used as the original backbone (Liu et al. 2020a), trained using the Adam optimizer with weight decay 1e-5, momentum 0.9, and batch size 512. |
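
For reference, the quoted optimizer settings translate into a minimal PyTorch sketch. This illustrates only the hyperparameters reported in the table; the learning rates are assumptions, as the quoted text does not state them, and the batch sizes would be configured on the respective DataLoaders rather than on the optimizers.

```python
import torch

def make_cifar10_optimizer(model: torch.nn.Module) -> torch.optim.Optimizer:
    # CIFAR-10 setting quoted above: SGD with weight decay 1e-4 and
    # momentum 0.9 (batch size 128 goes on the DataLoader).
    # lr=0.1 is an assumption; the quote does not state a learning rate.
    return torch.optim.SGD(model.parameters(), lr=0.1,
                           momentum=0.9, weight_decay=1e-4)

def make_imagenet_optimizer(model: torch.nn.Module) -> torch.optim.Optimizer:
    # ImageNet setting quoted above: Adam with weight decay 1e-5
    # (batch size 512 goes on the DataLoader). Adam's default
    # beta1 = 0.9 corresponds to the quoted momentum of 0.9.
    # lr=1e-3 is again an assumption.
    return torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
```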