Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement
Authors: Yansong Peng, Hebei Li, Peixi Wu, Yueyi Zhang, Xiaoyan Sun, Feng Wu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the COCO dataset (Lin et al., 2014a) demonstrate that D-FINE achieves state-of-the-art performance in real-time object detection, surpassing existing models in accuracy and efficiency. D-FINE-L and D-FINE-X achieve 54.0% and 55.8% AP on the COCO dataset at 124 / 78 FPS on an NVIDIA T4 GPU. |
| Researcher Affiliation | Academia | ¹University of Science and Technology of China; ²Institute of Artificial Intelligence, Hefei Comprehensive National Science Center |
| Pseudocode | No | The paper describes methods (FDR, GO-LSD) mathematically and textually (e.g., equations 2, 3, 5, 6) and uses figures to illustrate processes, but it does not include a distinct section or figure labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | Our code and models: https://github.com/Peterande/D-FINE. |
| Open Datasets | Yes | Experimental results on the COCO dataset (Lin et al., 2014a) demonstrate that D-FINE achieves state-of-the-art performance... We further pretrain D-FINE and YOLOv10 on the Objects365 dataset (Shao et al., 2019), before finetuning them on COCO. |
| Dataset Splits | Yes | We use the standard COCO2017 (Lin et al., 2014b) data splitting policy, training on COCO train2017, and evaluating on COCO val2017. |
| Hardware Specification | Yes | We measure end-to-end latency using TensorRT FP16 on an NVIDIA T4 GPU. ... The baseline model achieves an AP of 53.0%, with a training time of 29 minutes per epoch and memory usage of 8552 MB on four NVIDIA RTX 4090 GPUs. |
| Software Dependencies | No | The paper mentions 'TensorRT FP16' but does not provide a specific version number. It also mentions 'AdamW optimizer' but without a version or full software stack details. Therefore, explicit versioned software dependencies are not provided. |
| Experiment Setup | Yes | Table 6 summarizes the hyperparameter configurations for the D-FINE models. All variants use HGNetV2 backbones pretrained on ImageNet (Cui et al., 2021; Russakovsky et al., 2015) and the AdamW optimizer. D-FINE-X is set with an embedding dimension of 384 and a feedforward dimension of 2048, while the other models use 256 and 1024, respectively. The D-FINE-X and D-FINE-L have 6 decoder layers... The base learning rate and weight decay for D-FINE-X and D-FINE-L are 2.5e-4 and 1.25e-4, respectively... The total batch size is 32 across all variants. Training schedules include 72 epochs with advanced augmentation... followed by 2 epochs without advanced augmentation for D-FINE-X and D-FINE-L... |
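
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration record, which may help when comparing a reproduction attempt against the reported setup. The sketch below is illustrative only: the `DFineSetup` class and its field names are hypothetical and are not taken from the D-FINE repository, and only the values stated in the row above for D-FINE-X and D-FINE-L are filled in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFineSetup:
    """Hypothetical record of the hyperparameters quoted in the table above."""

    variant: str
    embed_dim: int
    feedforward_dim: int
    decoder_layers: int
    base_lr: float
    weight_decay: float
    total_batch_size: int = 32      # quoted as shared across all variants
    epochs_with_aug: int = 72       # quoted for D-FINE-X and D-FINE-L
    epochs_without_aug: int = 2     # quoted for D-FINE-X and D-FINE-L
    optimizer: str = "AdamW"
    backbone: str = "HGNetV2"

    @property
    def total_epochs(self) -> int:
        # Augmented epochs followed by the final non-augmented epochs.
        return self.epochs_with_aug + self.epochs_without_aug


# Only the two variants whose values appear in the quoted text are listed.
D_FINE_X = DFineSetup("D-FINE-X", 384, 2048, 6, 2.5e-4, 1.25e-4)
D_FINE_L = DFineSetup("D-FINE-L", 256, 1024, 6, 2.5e-4, 1.25e-4)

if __name__ == "__main__":
    for setup in (D_FINE_X, D_FINE_L):
        print(setup.variant, setup.embed_dim, setup.total_epochs)
```

Values for the smaller variants are elided in the quoted response, so they are deliberately left out here; Table 6 of the paper is the place to look for them.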