Cal-DETR: Calibrated Detection Transformer

Authors: Muhammad Akhtar Munir, Salman H. Khan, Muhammad Haris Khan, Mohsen Ali, Fahad Shahbaz Khan

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Lastly, we conduct extensive experiments across three in-domain and four out-domain scenarios. Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections while maintaining or even improving the detection performance.
Researcher Affiliation | Academia | Mohamed bin Zayed University of AI, Information Technology University, Australian National University, Linköping University
Pseudocode | No | The paper describes its methods through narrative and mathematical equations, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our codebase and pre-trained models can be accessed at https://github.com/akhtarvision/cal-detr.
Open Datasets | Yes | To perform experiments, we utilize various in-domain and out-domain benchmark datasets: MS-COCO [27], Cityscapes [4], Foggy Cityscapes [39], BDD100k [48], and Sim10k [17].
Dataset Splits | Yes | MS-COCO [27] is a large-scale object detection dataset of real images containing 80 classes. It is split into 118K train images, 41K test images, and 5K validation images. The train set (train2017) is used for training, while the validation set (val2017) is used for evaluation.
Hardware Specification | No | The paper mentions that "the computational resources were provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) ... and by the Berzelius resource, provided by the Knut and Alice Wallenberg Foundation at the National Supercomputer Center," but it does not specify concrete hardware details such as exact GPU or CPU models, processor types, or memory amounts.
Software Dependencies | No | The paper states, "For more implementation details, we refer readers to D-DETR [53], UP-DETR [5] and DINO [51]," implying the use of those frameworks, but it does not list specific software dependencies with version numbers, such as Python, PyTorch, or CUDA versions.
Experiment Setup | Yes | The logit mixing parameters are found empirically over the validation set: α1 = 0.9, α2 = 0.1, and λ = 0.5 (see Sec. 4.2). The focal loss is used for classification, and the generalized IoU (GIoU) and L1 losses for localization, as described in [26, 2]. A code sketch of this setup follows the table.