ShiftAddNet: A Hardware-Inspired Deep Network

Authors: Haoran You, Xiaohan Chen, Yongan Zhang, Chaojian Li, Sicheng Li, Zihao Liu, Zhangyang Wang, Yingyan Lin

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive experiments and ablation studies, all backed up by our FPGA-based ShiftAddNet implementation and energy measurements."
Researcher Affiliation | Collaboration | Department of Electrical and Computer Engineering, Rice University; Department of Electrical and Computer Engineering, The University of Texas at Austin; Alibaba DAMO Academy. {hy34, yz87, cl114, yingyan.lin}@rice.edu, {xiaohan.chen, atlaswang}@utexas.edu, {sicheng.li, zihao.liu}@alibaba-inc.com
Pseudocode | No | The paper includes mathematical formulations for backpropagation (e.g., Equations 2-6) but does not present them in a pseudocode or algorithm-block format.
Open Source Code | Yes | "Codes and pre-trained models are available at https://github.com/RICE-EIC/ShiftAddNet."
Open Datasets | Yes | "Models and datasets. We consider two DNN models (i.e., ResNet-20 [35] and VGG19-small models [36]) on six datasets: two classification datasets (i.e., CIFAR-10/100) and four IoT datasets (including MHEALTH [37], FlatCam Face [38], USC-HAD [39], and Head-pose detection [40])."
Dataset Splits | No | The paper specifies training and testing splits for some datasets (e.g., "80% for training and the remaining 20% for testing" for Head-pose), but it does not explicitly mention a separate validation split.
Hardware Specification | Yes | "Specifically, we implement ShiftAddNet on a ZYNQ-7 ZC706 FPGA board [9] and collect all real energy measurements for benchmarking." FPGA (ZYNQ-7 ZC706)
Software Dependencies | No | The paper mentions using an "SGD solver" but does not provide specific version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used.
Experiment Setup | Yes | "Training settings. For the CIFAR-10/100 and Head-pose datasets, the training takes a total of 160 epochs with a batch size of 256, where the initial learning rate is set to 0.1 and then divided by 10 at the 80-th and 120-th epochs, respectively, and an SGD solver is adopted with a momentum of 0.9 and a weight decay of 10^-4 following [42]."
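The training recipe quoted in the Experiment Setup row maps onto a standard SGD-with-step-decay schedule. Below is a minimal, illustrative sketch of that configuration, assuming a PyTorch-style API; the paper does not name its framework or library versions, and the helper function name here is hypothetical, not taken from the authors' released code.

```python
# Sketch of the quoted training settings (CIFAR-10/100 and Head-pose):
# 160 epochs, batch size 256, LR 0.1 divided by 10 at epochs 80 and 120,
# SGD with momentum 0.9 and weight decay 10^-4.
# Assumption: PyTorch-style API (framework not specified in the paper).
import torch

def build_training_schedule(model):
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=0.1,            # initial learning rate
        momentum=0.9,
        weight_decay=1e-4,
    )
    # Divide the learning rate by 10 at the 80th and 120th epochs
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[80, 120], gamma=0.1
    )
    return optimizer, scheduler
```

Under this reading of the setup, a training loop would run for 160 epochs with a batch size of 256 and call scheduler.step() once per epoch after the optimizer updates.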