ERQ: Error Reduction for Post-Training Quantization of Vision Transformers
Authors: Yunshan Zhong, Jiawei Hu, You Huang, Yuxin Zhang, Rongrong Ji
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results attest to the effectiveness of our approach. Notably, ERQ surpasses the state-of-the-art GPTQ by 22.36% in accuracy for W3A4 ViT-S. |
| Researcher Affiliation | Academia | 1Institute of Artificial Intelligence, Xiamen University. 2Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University. 3Department of Artificial Intelligence, School of Informatics, Xiamen University. 4Peng Cheng Laboratory. |
| Pseudocode | Yes | Algorithm 1 Weight Quantization Error Reduction |
| Open Source Code | No | The paper does not provide explicit statements about the release of its own source code for ERQ, nor does it include a link to a code repository for its method. |
| Open Datasets | Yes | We conduct extensive experiments on image classification, object detection, and instance segmentation. For the image classification task, we evaluate ERQ on the ImageNet dataset (Russakovsky et al., 2015), with different ViT variants including ViT (Dosovitskiy et al., 2021), DeiT (Touvron et al., 2021), and Swin (Liu et al., 2021a). As for object detection and instance segmentation tasks, we evaluate ERQ on the COCO dataset (Lin et al., 2014) with Mask R-CNN (He et al., 2017) and Cascade Mask R-CNN (Cai & Vasconcelos, 2018), both using Swin (Liu et al., 2021a) as their backbone. |
| Dataset Splits | Yes | Consistent with the previous study (Li et al., 2023), we randomly select 32 images from ImageNet and 1 image from the COCO dataset. The quantization parameters are determined by forwarding the calibration datasets, and the reparameterization technique is used to initialize the activation quantizer as in (Li et al., 2023). |
| Hardware Specification | Yes | All experiments are implemented using the PyTorch framework (Paszke et al., 2019) with a single NVIDIA 3090 GPU and an Intel Xeon 4214R CPU. |
| Software Dependencies | No | The paper mentions the 'PyTorch framework (Paszke et al., 2019)' and 'pulp (a CPU-only LP modeler written in Python)' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | In our experiments, the k and maximum iteration of Rounding Refinement are set to 1 and 100, respectively. We use pulp (a CPU-only LP modeler written in Python) to solve the MIPQ. For the image classification task, we set λ1 = λ2 = 1e4 for ViT, λ1 = λ2 = 1e3 for DeiT-T, λ1 = λ2 = 1e4 for DeiT-S and DeiT-B, and λ1 = λ2 = 1e4 for Swin. For detection and segmentation tasks, we set λ1 = λ2 = 1e5 for all models. |
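For reproduction, the reported hyperparameters above can be collected into a small configuration helper. This is only an illustrative sketch transcribing the paper's stated values; the names (`CLASSIFICATION_LAMBDAS`, `get_lambdas`) are not from the authors' code.

```python
# Reported ERQ calibration hyperparameters, transcribed from the paper's
# experiment setup. All names here are illustrative, not the authors' API.

# Per-model regularization strengths (lambda1, lambda2) for image classification.
CLASSIFICATION_LAMBDAS = {
    "ViT": (1e4, 1e4),
    "DeiT-T": (1e3, 1e3),
    "DeiT-S": (1e4, 1e4),
    "DeiT-B": (1e4, 1e4),
    "Swin": (1e4, 1e4),
}

# Detection and segmentation reportedly use lambda1 = lambda2 = 1e5 everywhere.
DETECTION_LAMBDAS = (1e5, 1e5)

# Rounding Refinement settings reported in the paper.
ROUNDING_K = 1
MAX_ITERATIONS = 100


def get_lambdas(model: str, task: str = "classification") -> tuple[float, float]:
    """Return (lambda1, lambda2) for a given model and task."""
    if task in ("detection", "segmentation"):
        return DETECTION_LAMBDAS
    return CLASSIFICATION_LAMBDAS[model]
```

A run script could then call `get_lambdas("DeiT-T")` to recover `(1e3, 1e3)` without hard-coding values per experiment.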