Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation
Authors: Yiming Cui, Linjie Yang, Haichao Yu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the superior performance of our approach combined with a wide range of DETR-based models on MS COCO (Lin et al., 2014), Cityscapes (Cordts et al., 2016) and YouTube-VIS (Yang et al., 2019b) benchmarks with multiple tasks, including object detection, instance segmentation, and panoptic segmentation. |
| Researcher Affiliation | Collaboration | ¹Department of Electrical and Computer Engineering, University of Florida, Gainesville, USA; ²ByteDance Inc., San Jose, USA. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | For the object detection task, we use MS COCO benchmark (Lin et al., 2014) for evaluation, which contains 118,287 images for training and 5,000 for validation. |
| Dataset Splits | Yes | For the object detection task, we use MS COCO benchmark (Lin et al., 2014) for evaluation, which contains 118,287 images for training and 5,000 for validation. |
| Hardware Specification | Yes | The training time is based on 8 NVIDIA A100 GPUs and the inference FPS is tested on a single TITAN RTX GPU. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | The query ratio r used to generate the combination coefficients is set to 4 by default. β is set to be 1. θ is implemented as a two-layer MLP with ReLU as nonlinear activations. The output size of its first layer is 512, and that of the second layer is the length of W D in corresponding models. For detection models, we use 300 modulated queries and 1200 basic queries if not specified otherwise. |
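
The Experiment Setup row can be made concrete with a small sketch. The following is not the authors' code (none is released, per the table); it only illustrates how the quoted numbers fit together: a two-layer MLP θ with a 512-unit hidden layer and ReLU, producing one coefficient per basic query, with 1200 basic queries reduced to 300 modulated queries at ratio r = 4. Several details are assumptions not stated in the row: the input feature size `FEAT_DIM`, the query width `QUERY_DIM`, applying ReLU only between the two layers, and grouping r consecutive basic queries into one modulated query. The helper names are hypothetical.

```python
import math
import random

R = 4                 # query ratio r (from the table)
NUM_MODULATED = 300   # modulated queries (from the table)
NUM_BASIC = 1200      # basic queries (from the table); equals R * NUM_MODULATED
HIDDEN = 512          # output size of the first MLP layer (from the table)
FEAT_DIM = 256        # ASSUMPTION: size of the feature vector fed to theta
QUERY_DIM = 8         # ASSUMPTION: tiny query width, chosen to keep the demo fast


def make_linear(in_dim, out_dim, rng):
    """Hypothetical helper: random weights and zero bias for one linear layer."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def apply_linear(layer, x):
    """Compute W x + b for a layer built by make_linear."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def theta(layers, feat):
    """Two-layer MLP; ReLU is applied after the first layer only (assumption)."""
    hidden = [max(0.0, v) for v in apply_linear(layers[0], feat)]
    return apply_linear(layers[1], hidden)


def modulate(coeffs, basic_queries):
    """ASSUMPTION: modulated query i is the weighted sum of basic queries
    i*R .. i*R + R - 1, using one coefficient per basic query."""
    modulated = []
    for i in range(NUM_MODULATED):
        group = range(i * R, (i + 1) * R)
        modulated.append([
            sum(coeffs[j] * basic_queries[j][d] for j in group)
            for d in range(QUERY_DIM)
        ])
    return modulated


rng = random.Random(0)
layers = (make_linear(FEAT_DIM, HIDDEN, rng),
          make_linear(HIDDEN, NUM_BASIC, rng))
feat = [rng.gauss(0.0, 1.0) for _ in range(FEAT_DIM)]           # stand-in image feature
basic = [[rng.gauss(0.0, 1.0) for _ in range(QUERY_DIM)]
         for _ in range(NUM_BASIC)]                              # stand-in basic queries

coeffs = theta(layers, feat)         # one coefficient per basic query
modulated = modulate(coeffs, basic)  # 300 queries, each of width QUERY_DIM
```

Under these assumptions the second MLP layer outputs 1200 values, which matches the stated 300 × 4 relationship between modulated queries, basic queries, and r. The role of β is not described in the row, so it is left out of the sketch.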