Post-Training Quantization for Vision Transformer

Authors: Zhenhua Liu, Yunhe Wang, Kai Han, Wei Zhang, Siwei Ma, Wen Gao

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of the proposed method is verified on several benchmark models and datasets, and it outperforms state-of-the-art post-training quantization algorithms. For instance, we can obtain an 81.29% top-1 accuracy using the DeiT-B model on the ImageNet dataset with about 8-bit quantization.
Researcher Affiliation | Collaboration | Zhenhua Liu (1,2), Yunhe Wang (2), Kai Han (2), Wei Zhang (2), Siwei Ma (1,3), Wen Gao (1,3). (1) School of Electronic Engineering and Computer Science, Peking University; (2) Huawei Noah's Ark Lab; (3) Peng Cheng Laboratory.
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/VTPTQ.
Open Datasets | Yes | For image classification, the CIFAR-10, CIFAR-100 and ILSVRC-2012 ImageNet (we refer to it as ImageNet in what follows) datasets are utilized to evaluate the quantization performance. ... For the object detection task, the COCO2017 dataset is utilized to evaluate the quantization performance, which contains 118K training images and 5K validation images.
Dataset Splits | Yes | The CIFAR-10 dataset consists of 50K training images and 10K test images... the CIFAR-100 dataset also contains 50K training images and 10K test images... the ImageNet dataset contains 1.2 million training images and 50K validation images... the COCO2017 dataset ... contains 118K training images and 5K validation images.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models or memory) used for running the experiments were provided. The paper only mentions the 'High-Performance Computing Platform of Peking University' in the acknowledgements, which is not specific enough.
Software Dependencies | No | The paper mentions 'MindSpore' in the code repository link, but does not provide specific version numbers for MindSpore or any other key software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA) used in the experiments.
Experiment Setup | Yes | We randomly select 100 images for the CIFAR-10 and CIFAR-100 datasets and 1000 images for the ImageNet and COCO2017 datasets from the training set as the calibration dataset. For the hyper-parameters, α and β are set to 0.5 and 1.2 for all experiments. The maximum number of iterations is set to 20 if not mentioned specifically. For mixed-precision, we utilize {4,5,6,7,8} and {6,7,8,9,10} bits when the target bit-widths are 6-bit and 8-bit, respectively.
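
To make the quoted setup concrete, below is a minimal sketch of the calibration-set sampling and hyper-parameter configuration described above. The use of PyTorch/torchvision and the names CALIB_SIZE, MIXED_BITS, and build_calibration_loader are illustrative assumptions, not the authors' API; their released code targets MindSpore.

```python
import random

import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# Calibration-set sizes quoted from the paper's setup.
CALIB_SIZE = {"cifar10": 100, "cifar100": 100, "imagenet": 1000, "coco2017": 1000}

# Hyper-parameters quoted from the paper: alpha = 0.5, beta = 1.2,
# at most 20 iterations, and candidate bit-widths for mixed precision.
ALPHA, BETA = 0.5, 1.2
MAX_ITERS = 20
MIXED_BITS = {
    6: [4, 5, 6, 7, 8],    # candidates when the target bit-width is 6-bit
    8: [6, 7, 8, 9, 10],   # candidates when the target bit-width is 8-bit
}

def build_calibration_loader(train_set, num_samples, batch_size=32, seed=0):
    """Randomly draw `num_samples` training images as the calibration set."""
    rng = random.Random(seed)
    indices = rng.sample(range(len(train_set)), num_samples)
    return DataLoader(Subset(train_set, indices), batch_size=batch_size)

if __name__ == "__main__":
    # Example: the 100-image CIFAR-10 calibration split.
    train_set = datasets.CIFAR10(
        root="./data", train=True, download=True,
        transform=transforms.ToTensor(),
    )
    calib_loader = build_calibration_loader(train_set, CALIB_SIZE["cifar10"])
```

Under the same recipe, swapping the CIFAR-10 training set for ImageNet or COCO2017 training data with num_samples=1000 would yield the larger calibration sets the paper describes.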