BiBench: Benchmarking and Analyzing Network Binarization

Authors: Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To close this gap, we present BiBench, a rigorously designed benchmark with in-depth analysis for network binarization. We first carefully scrutinize the requirements of binarization in the actual production and define evaluation tracks and metrics for a comprehensive and fair investigation. Then, we evaluate and analyze a series of milestone binarization algorithms..."
Researcher Affiliation | Academia | "1Beihang University, 2ETH Zürich, 3Nanyang Technological University."
Pseudocode | No | The paper describes implementation details and provides equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. (For orientation, an illustrative sketch of the standard binarization operator appears after the table.)
Open Source Code | Yes | "The code for our BiBench is released here."
Open Datasets | Yes | "For the widely-evaluated 2D visual modality tasks, we include image classification on CIFAR10 (Krizhevsky et al., 2014) and ImageNet (Krizhevsky et al., 2012) datasets, as well as object detection on PASCAL VOC (Hoiem et al., 2009) and COCO (Lin et al., 2014) datasets."
Dataset Splits | Yes | "Each class has 6000 images, where 5000 are for training and 1000 are for testing."
Hardware Specification | Yes | "We focus on ARM CPU inference on mainstream hardware for edge scenarios, such as HUAWEI Kirin, Qualcomm Snapdragon, Apple M1, MediaTek Dimensity, and Raspberry Pi (details in Appendix A.4). Their products include Kirin 970, Kirin 980, Kirin 985, etc."
Software Dependencies | No | "BiBench is implemented using the PyTorch (Paszke et al., 2019) package." The paper mentions PyTorch but does not specify its version number or any other software dependencies with versions required for replication.
Experiment Setup | Yes | "Binarized networks are trained for the same number of epochs as their full-precision counterparts. Inspired by the results in Section 5.2.1, we use the Adam optimizer for all binarized models for well converging. The default initial learning rate is 1e-3 (or 0.1× the default learning rate), and the learning rate scheduler is CosineAnnealingLR (Loshchilov & Hutter, 2017)."
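
The training recipe quoted in the Experiment Setup row maps onto a few lines of PyTorch. The sketch below only illustrates that recipe and is not the authors' released code: the tiny model, dummy data, and epoch count are placeholders, whereas BiBench trains each binarized network for the same number of epochs as its full-precision counterpart.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Minimal sketch of the quoted setup: Adam with initial LR 1e-3 and a
# CosineAnnealingLR schedule. Model, data, and epoch count are placeholders.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in for a binarized network
epochs = 10  # assumption for the sketch; in practice, match the full-precision schedule

# Dummy CIFAR-10-shaped data so the sketch runs end to end.
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
criterion = nn.CrossEntropyLoss()

for _ in range(epochs):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # cosine-annealed learning rate, stepped once per epoch
```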
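
As noted in the Pseudocode row, the paper conveys binarization through equations rather than algorithm blocks. For a concrete reference point, the following is a generic sketch of the canonical 1-bit quantizer (sign function in the forward pass, straight-through estimator in the backward pass) that most benchmarked binarization algorithms build on; it is a textbook formulation for illustration only, not code from the paper.

```python
import torch

# Canonical 1-bit quantizer: sign(x) forward, straight-through estimator (STE)
# backward. Generic illustration; not the BiBench implementation.
class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Map every element to {-1, +1} (zero treated as +1).
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # STE: pass the gradient through where |x| <= 1, clip it elsewhere.
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)


w = torch.randn(8, requires_grad=True)  # latent full-precision weights
w_bin = BinarizeSTE.apply(w)            # binarized weights used in the forward pass
w_bin.sum().backward()                  # gradients reach the latent weights via the STE
print(w_bin, w.grad)
```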