Towards Accurate Post-training Network Quantization via Bit-Split and Stitching
Authors: Peisong Wang, Qiang Chen, Xiangyu He, Jian Cheng
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the efficiency of our proposed method. We first evaluate the Bit-Split and Stitching method for weight quantization. Then the performance of Error Compensated Activation Quantization is evaluated. We also compare our method with current post-training methods. All bit-width representations throughout this paper take the sign bit into consideration. Codes are available on GitHub at https://github.com/wps712/BitSplit. |
| Researcher Affiliation | Academia | NLPR & AIRIA, Institute of Automation, Chinese Academy of Sciences. |
| Pseudocode | Yes | Algorithm 1 Post-training quantization using Error Compensated Activation Quantization and Bit-Split and Stitching weight quantization. |
| Open Source Code | Yes | Codes are available on GitHub at https://github.com/wps712/BitSplit. |
| Open Datasets | Yes | The top-1 and top-5 accuracy results of post-training quantization are reported using four popular convolutional models pre-trained on the ImageNet dataset. We use the PyTorch pretrained models for all experiments. [...] MS COCO dataset is used for evaluation. |
| Dataset Splits | Yes | The pre-trained models are trained on 80k training images and 35k validation images (trainval35k), and are evaluated on the remaining 5k validation images (minival). |
| Hardware Specification | No | The paper mentions running experiments on "GPU" and "TPU" but does not provide specific details such as model numbers, memory, or processor types for the hardware used. |
| Software Dependencies | No | The paper mentions using "PyTorch pretrained models" and the "mmdetection toolbox" but does not specify version numbers for these software components, which is required for reproducibility. |
| Experiment Setup | Yes | We quantize all layers into 4bit except the first layer and the final output layers which are quantized to 8bit. Activations are quantized into 8bit. The experiments are conducted using the mmdetection toolbox. [...] Input images are resized to 800 pixels in the shorter edge. |
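
The Experiment Setup row describes the bit-width assignment used in the paper's experiments: 4-bit weights everywhere except the first and final output layers, which stay at 8-bit, with 8-bit activations and bit-widths that include the sign bit. The snippet below is a minimal sketch of that assignment only; the quantizer is a plain symmetric rounding stand-in rather than the paper's Bit-Split and Stitching optimization, and the helper names (`assign_bit_widths`, `uniform_symmetric_quantize`) are hypothetical.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18


def uniform_symmetric_quantize(w, bits):
    # Symmetric uniform quantizer. The bit-width counts the sign bit,
    # matching the paper's convention, so integer levels lie in
    # [-(2^(bits-1) - 1), 2^(bits-1) - 1].
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp_(-qmax, qmax) * scale


def assign_bit_widths(model, default_bits=4, edge_bits=8):
    # Hypothetical helper: 4-bit for all conv/linear weights except the
    # first layer and the final output layer, which are kept at 8-bit,
    # as described in the Experiment Setup row.
    layers = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    return {
        layer: (edge_bits if i in (0, len(layers) - 1) else default_bits)
        for i, layer in enumerate(layers)
    }


# Quantize the weights in place according to the plan. Activations would be
# quantized to 8-bit at inference time; that step is omitted from this sketch.
model = resnet18()
for layer, bits in assign_bit_widths(model).items():
    with torch.no_grad():
        layer.weight.copy_(uniform_symmetric_quantize(layer.weight, bits))
```

This stand-in quantizer only illustrates the layer-wise bit-width plan; reproducing the reported accuracies would additionally require the paper's bit-split weight optimization and Error Compensated Activation Quantization from the released BitSplit code.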