Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
Authors: Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, Shuchang Zhou
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on various transformer-based architectures and benchmarks show that our Fully Quantized Vision Transformer (FQ-ViT) outperforms previous works while even using lower bitwidth on attention maps. For instance, we reach 84.89% top-1 accuracy with ViT-L on ImageNet and 50.8 mAP with Cascade Mask R-CNN (Swin-S) on COCO. To our knowledge, we are the first to achieve lossless accuracy degradation (~1%) on fully quantized vision transformers. |
| Researcher Affiliation | Industry | Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li and Shuchang Zhou, MEGVII Technology, EMAIL, EMAIL |
| Pseudocode | No | The paper describes algorithms and formulas but does not include explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/megvii-research/FQ-ViT. |
| Open Datasets | Yes | We randomly sample 1000 training images from ImageNet or COCO as the calibration data, and use the validation set to evaluate performance. Apart from special notes, we perform symmetric channel-wise quantization for weights and asymmetric layer-wise quantization for activations. For a fair comparison, the quantization for weights is fixed as MinMax. The hyperparameter K in Power-of-Two Factor is set to 3. ImageNet [Krizhevsky et al., 2012] and COCO [Lin et al., 2014] |
| Dataset Splits | Yes | We randomly sample 1000 training images from ImageNet or COCO as the calibration data, and use the validation set to evaluate performance. Apart from special notes, we perform symmetric channel-wise quantization for weights and asymmetric layer-wise quantization for activations. For a fair comparison, the quantization for weights is fixed as MinMax. The hyperparameter K in Power-of-Two Factor is set to 3. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments. It only mentions general concepts like 'resource-constrained hardware devices' and 'floating-point units in the hardware'. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | We randomly sample 1000 training images from ImageNet or COCO as the calibration data, and use the validation set to evaluate performance. Apart from special notes, we perform symmetric channel-wise quantization for weights and asymmetric layer-wise quantization for activations. For a fair comparison, the quantization for weights is fixed as MinMax. The hyperparameter K in Power-of-Two Factor is set to 3. |
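
The experiment setup quoted in the table can be illustrated with a minimal sketch. This is not the authors' code from the FQ-ViT repository: the function names, the 8-bit width, the fixed seed, and the training-set size of 50,000 are all assumptions made for illustration, and the Power-of-Two Factor (K = 3) is not implemented here. The sketch only shows what sampling 1000 calibration images, symmetric channel-wise MinMax quantization of weights, and asymmetric layer-wise MinMax quantization of activations could look like in plain Python.

```python
import random


def sample_calibration_indices(num_train_images, num_samples=1000, seed=0):
    """Pick a random calibration subset of the training set, without replacement."""
    return random.Random(seed).sample(range(num_train_images), num_samples)


def quantize_weights_symmetric(weight, bits=8):
    """Symmetric channel-wise MinMax: one scale per output channel (row), zero point 0."""
    qmax = 2 ** (bits - 1) - 1
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(w) for w in row)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q_rows.append([max(-qmax, min(qmax, round(w / scale))) for w in row])
        scales.append(scale)
    return q_rows, scales


def quantize_activations_asymmetric(values, bits=8):
    """Asymmetric layer-wise MinMax: a single scale and zero point for the whole tensor."""
    qmax = 2 ** bits - 1
    # Include 0.0 in the range so that real zero maps exactly to the zero point.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


# Toy data (hypothetical): two output channels with very different ranges.
weight = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
q_weight, w_scales = quantize_weights_symmetric(weight)

activations = [-0.5, 0.0, 1.5, 3.0]
q_act, a_scale, a_zero = quantize_activations_asymmetric(activations)

# 50_000 is a placeholder training-set size, not the real ImageNet or COCO count.
calib = sample_calibration_indices(num_train_images=50_000)
```

Giving each weight channel its own scale lets the small-magnitude second row use the full integer range, while the activation tensor shares one scale and zero point, which is what "layer-wise" refers to in the quoted setup.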