Deformable DETR: Deformable Transformers for End-to-End Object Detection

Authors: Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments on the COCO benchmark demonstrate the effectiveness of our approach. |
| Researcher Affiliation | Collaboration | Xizhou Zhu¹, Weijie Su², Lewei Lu¹, Bin Li², Xiaogang Wang¹˒³, Jifeng Dai¹ (¹SenseTime Research; ²University of Science and Technology of China; ³The Chinese University of Hong Kong) |
| Pseudocode | No | The paper does not contain any section or figure explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | Code is released at https://github.com/fundamentalvision/Deformable-DETR. |
| Open Datasets | Yes | We conduct experiments on the COCO 2017 dataset (Lin et al., 2014). |
| Dataset Splits | Yes | Our models are trained on the train set, and evaluated on the val set and test-dev set. (See the loading sketch below the table.) |
| Hardware Specification | Yes | Run time is evaluated on an NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer (Kingma & Ba, 2015)" but does not specify versions for other key software components, such as a deep learning framework (e.g., PyTorch, TensorFlow) or specific libraries. |
| Experiment Setup | Yes | M = 8 and K = 4 are set for deformable attention by default. Parameters of the deformable Transformer encoder are shared among different feature levels. Other hyper-parameter settings and the training strategy mainly follow DETR (Carion et al., 2020), except that Focal Loss (Lin et al., 2017b) with a loss weight of 2 is used for bounding-box classification, and the number of object queries is increased from 100 to 300. By default, models are trained for 50 epochs and the learning rate is decayed at the 40th epoch by a factor of 0.1. Following DETR (Carion et al., 2020), we train our models using the Adam optimizer (Kingma & Ba, 2015) with a base learning rate of 2×10⁻⁴, β₁ = 0.9, β₂ = 0.999, and weight decay of 10⁻⁴. Learning rates of the linear projections, used for predicting object query reference points and sampling offsets, are multiplied by a factor of 0.1. (See the PyTorch sketch below the table.) |
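
For concreteness, the split usage quoted in the Dataset Splits row maps onto a standard COCO 2017 download. The following is a minimal torchvision sketch; the directory layout and paths are assumptions about a typical COCO setup, not something specified in the paper.

```python
from torchvision.datasets import CocoDetection  # requires pycocotools

# Assumed standard COCO 2017 directory layout; adjust paths to your download.
train_set = CocoDetection(
    root="coco/train2017",
    annFile="coco/annotations/instances_train2017.json",
)
val_set = CocoDetection(
    root="coco/val2017",
    annFile="coco/annotations/instances_val2017.json",
)

# COCO 2017: 118,287 train / 5,000 val images.
print(len(train_set), len(val_set))

# test-dev annotations are withheld; results on test-dev are obtained by
# submitting predictions to the COCO evaluation server.
```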
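
The training recipe in the Experiment Setup row translates directly into standard PyTorch primitives. The sketch below is an illustration under assumptions: the model is a toy stand-in, and the module names matched by the 0.1× learning-rate filter (`reference_points`, `sampling_offsets`) mirror the naming used in the released repository but are not the authors' exact code.

```python
import torch
from torch import nn

# Toy stand-in for the network; in practice this is the full Deformable DETR
# model. The module names are illustrative so the filter below has something
# to match.
model = nn.ModuleDict({
    "backbone": nn.Linear(256, 256),
    "reference_points": nn.Linear(256, 2),   # predicts query reference points
    "sampling_offsets": nn.Linear(256, 64),  # predicts deformable sampling offsets
})

def reduced_lr(name: str) -> bool:
    """Linear projections predicting reference points / sampling offsets get 0.1x LR."""
    return "reference_points" in name or "sampling_offsets" in name

base_lr = 2e-4  # base learning rate of 2x10^-4 quoted above
param_groups = [
    {"params": [p for n, p in model.named_parameters() if not reduced_lr(n)],
     "lr": base_lr},
    {"params": [p for n, p in model.named_parameters() if reduced_lr(n)],
     "lr": base_lr * 0.1},
]

# Adam with the quoted betas and weight decay.
optimizer = torch.optim.Adam(param_groups, betas=(0.9, 0.999), weight_decay=1e-4)

# 50-epoch schedule with a single 0.1 decay at epoch 40.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[40], gamma=0.1)

for epoch in range(50):
    # ... one training epoch over COCO train2017 goes here ...
    scheduler.step()
```

The Focal Loss mentioned in the row corresponds to a sigmoid focal loss on the classification head (e.g. `torchvision.ops.sigmoid_focal_loss`), weighted by a factor of 2 in the overall objective.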