AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries
Authors: Runqi Wang, Huixin Sun, Linlin Yang, Shaohui Lin, Chuanjian Liu, Yan Gao, Yao Hu, Baochang Zhang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through our extensive experiments on large-scale open datasets, the performance of the 4-bit quantization of DETR and Deformable DETR models is comparable to that of their full-precision counterparts. |
| Researcher Affiliation | Collaboration | Runqi Wang1, Huixin Sun1, Linlin Yang2*, Shaohui Lin3, Chuanjian Liu4, Yan Gao5, Yao Hu5, Baochang Zhang1,6,7. 1ASEE, EIE and Hangzhou Research Institute, Beihang University; 2State Key Laboratory of Media Convergence and Communication, Communication University of China; 3School of Computer Science and Technology, East China Normal University; 4Huawei Noah's Ark Lab; 5Xiaohongshu Inc.; 6Zhongguancun Laboratory; 7Nanchang Institute of Technology |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it include a link to a code repository. |
| Open Datasets | Yes | We conduct experiments on the PASCAL VOC dataset (Everingham et al. 2010) and the COCO 2017 object detection dataset (Lin et al. 2014). |
| Dataset Splits | Yes | We use VOC trainval2012 and VOC trainval2007 for training, and the VOC test2007 set for evaluation. For COCO 2017 object detection, we use its standard train and test split. (A hypothetical loading sketch for the VOC splits follows the table.) |
| Hardware Specification | Yes | We run the experiments on 8 NVIDIA Tesla A100 GPUs with 40 GB memory. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al. 2017) is used for implementing our baseline and AQ-DETR' but does not specify version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | We use Adam (Loshchilov and Hutter 2017) with a batch size of 8 and an initial learning rate of 2e-4. The scale factor λ_dis of the LLD loss is set to 0.1 and K = 6 in Eq. 3 by default, and cross-entropy is chosen as the distillation loss. The quantized DETR is trained for 300 epochs and the learning rate is multiplied by 0.1 at the 200th epoch. We use 100 object queries and 500 auxiliary queries for training, and 100 object queries for testing in the DETR framework. Following Deformable DETR, the quantized Deformable DETR is trained for 12 epochs, and the learning rate is multiplied by 0.1 at the 11th epoch on both the VOC and COCO datasets. We use 300 object queries and 1500 auxiliary queries for training, and 300 object queries for testing. (A hypothetical training-loop sketch of this schedule follows the table.) |
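The Dataset Splits row describes the standard VOC07+12 protocol. Since the authors released no code, the following is a minimal loading sketch assuming torchvision; the data root and the use of torchvision itself are assumptions, not the paper's setup.

```python
import torch
from torchvision.datasets import VOCDetection

# VOC trainval2007 + trainval2012 for training, VOC test2007 for
# evaluation, as quoted from the paper. Detection transforms and
# target conversion are omitted for brevity.
root = "data/voc"  # hypothetical data root
train_set = torch.utils.data.ConcatDataset([
    VOCDetection(root, year="2007", image_set="trainval", download=True),
    VOCDetection(root, year="2012", image_set="trainval", download=True),
])
test_set = VOCDetection(root, year="2007", image_set="test", download=True)
```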
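Likewise, the Experiment Setup row pins down the optimizer, schedule, and distillation weight for the DETR run. Below is a minimal PyTorch sketch of that schedule; the model builder, data loader, and logit shapes are hypothetical stand-ins, and AdamW is used because the quoted citation (Loshchilov and Hutter 2017) is the AdamW paper.

```python
import torch
import torch.nn.functional as F

model = build_quantized_detr()  # hypothetical builder; no code was released
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
# Learning rate multiplied by 0.1 at epoch 200 of 300, per the paper.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[200], gamma=0.1)

lambda_dis = 0.1  # scale factor of the LLD distillation term

for epoch in range(300):
    for images, targets in train_loader:  # batch size 8 in the paper
        # Hypothetical forward pass returning the detection loss plus
        # student/teacher logits, flattened to (N, num_classes).
        det_loss, s_logits, t_logits = model(images, targets)
        # Cross-entropy against softened teacher predictions, the
        # distillation loss the paper names (soft targets need PyTorch >= 1.10).
        dis_loss = F.cross_entropy(s_logits, t_logits.softmax(dim=-1))
        loss = det_loss + lambda_dis * dis_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```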