DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Authors: Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our methods. Code is available at https://github.com/IDEA-opensource/DAB-DETR. (...) Our results demonstrate that DAB-DETR attains the best performance among DETR-like architectures under the same setting on the COCO object detection benchmark. The proposed method can achieve 45.7% AP when using a single ResNet-50 (He et al., 2016) model as backbone for training 50 epochs. |
| Researcher Affiliation | Collaboration | Shilong Liu (1,2), Feng Li (2,3), Hao Zhang (2,3), Xiao Yang (1), Xianbiao Qi (2), Hang Su (1,4), Jun Zhu (1,4), Lei Zhang (2). (1) Dept. of Comp. Sci. and Tech., BNRist Center, State Key Lab for Intell. Tech. & Sys., Institute for AI, Tsinghua-Bosch Joint Center for ML, Tsinghua University. (2) International Digital Economy Academy (IDEA). (3) Hong Kong University of Science and Technology. (4) Peng Cheng Laboratory, Shenzhen, Guangdong, China. |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/IDEA-opensource/DAB-DETR. (...) We have released the source code on Github at https://github.com/IDEA-opensource/DAB-DETR with all materials that are needed to reproduce our results. |
| Open Datasets | Yes | Dataset. We conduct the experiments on the COCO (Lin et al., 2014) object detection dataset. All models are trained on the train2017 split and evaluated on the val2017 split. |
| Dataset Splits | Yes | All models are trained on the train2017 split and evaluated on the val2017 split. |
| Hardware Specification | Yes | All models are trained on Nvidia A100 GPU. |
| Software Dependencies | No | The paper mentions using components like ResNet, Transformer, AdamW, PReLU, and focal loss, but it does not specify version numbers for these or any other software libraries or frameworks (e.g., PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | All models are trained on 16 GPUs with 1 image per GPU and AdamW (Loshchilov & Hutter, 2018) is used for training with weight decay 10⁻⁴. The learning rates for backbone and other modules are set to 10⁻⁵ and 10⁻⁴, respectively. We train our models for 50 epochs and drop the learning rate by 0.1 after 40 epochs. (...) We also use focal loss (Lin et al., 2020) with α = 0.25, γ = 2 for classification. (...) L1 loss with coefficient 5.0 and GIOU loss (Rezatofighi et al., 2019) with coefficient 2.0 are consistent in both the matching and the final loss calculation procedures. |
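
The dataset rows above (train2017 for training, val2017 for evaluation) correspond to the standard COCO 2017 layout. Below is a minimal `torchvision` sketch of loading that split; the local paths are hypothetical, and the paper's repository may build its dataloaders differently:

```python
from torchvision.datasets import CocoDetection  # requires pycocotools

# Hypothetical local paths; assumes the COCO 2017 images and annotations
# have already been downloaded and extracted.
train_set = CocoDetection(
    root="coco/train2017",
    annFile="coco/annotations/instances_train2017.json",
)
val_set = CocoDetection(
    root="coco/val2017",
    annFile="coco/annotations/instances_val2017.json",
)
print(len(train_set), len(val_set))  # COCO 2017: 118287 train / 5000 val images
```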
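
The Experiment Setup row can likewise be read back into code. The following PyTorch sketch wires up only the quoted hyperparameters (AdamW with weight decay 10⁻⁴, backbone learning rate 10⁻⁵ vs. 10⁻⁴ elsewhere, a 0.1 learning-rate drop after epoch 40, and the focal/L1/GIoU coefficients); the toy `model`, the omission of DETR's bipartite matching, and the assumption of (x1, y1, x2, y2) boxes are simplifications, not the paper's implementation:

```python
import torch
from torch import nn
from torch.nn import functional as F
from torchvision.ops import sigmoid_focal_loss, generalized_box_iou_loss

# Toy stand-in: any model exposing a `backbone` submodule works for this sketch.
model = nn.ModuleDict({
    "backbone": nn.Conv2d(3, 64, kernel_size=3),
    "transformer": nn.Linear(64, 256),
})

# Backbone trains at 1e-5, all other modules at 1e-4; AdamW weight decay is 1e-4.
# Effective batch size in the paper: 16 GPUs x 1 image per GPU.
optimizer = torch.optim.AdamW(
    [
        {"params": [p for n, p in model.named_parameters() if "backbone" in n], "lr": 1e-5},
        {"params": [p for n, p in model.named_parameters() if "backbone" not in n], "lr": 1e-4},
    ],
    weight_decay=1e-4,
)

# 50-epoch schedule with the learning rate dropped by 0.1 after epoch 40
# (call scheduler.step() once per epoch).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)

def set_criterion(pred_logits, pred_boxes, tgt_onehot, tgt_boxes):
    """Loss terms with the quoted coefficients; Hungarian matching is omitted."""
    cls = sigmoid_focal_loss(pred_logits, tgt_onehot, alpha=0.25, gamma=2.0, reduction="mean")
    l1 = F.l1_loss(pred_boxes, tgt_boxes)
    giou = generalized_box_iou_loss(pred_boxes, tgt_boxes, reduction="mean")  # xyxy boxes
    return cls + 5.0 * l1 + 2.0 * giou
```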