Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Authors: Yu Li, Xingyu Qiu, Yuqian Fu, Jie Chen, Tianwen Qian, Xu Zheng, Danda Pani Paudel, Yanwei Fu, Xuanjing Huang, Luc V Gool, Yu-Gang Jiang

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show consistent improvements over strong baselines and establish new state-of-the-art results. The source code and instructions are available at https://github.com/LiYu0524/Domain-RAG. We validate Domain-RAG on three various tasks that address few-shot object detection with domain shifts: CD-FSOD, remote sensing FSOD (RS-FSOD), and camouflaged FSOD. In all tasks, our method consistently improves a strong baseline by an average of +7.3, +1.1, and +2.1 mAP under the lowest-shot setting, achieving new state-of-the-art (SOTA) performance.
Researcher Affiliation | Academia | (1) Fudan University; (2) INSAIT, Sofia University St. Kliment Ohridski; (3) Fuzhou University; (4) East China Normal University; (5) HKUST(GZ)
Pseudocode | No | The paper describes the methodology in detail across Sections 3.1, 3.2, and 3.3 using natural language and conceptual diagrams (Figure 2), but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code | Yes | The source code and instructions are available at https://github.com/LiYu0524/Domain-RAG.
Open Datasets | Yes | We conduct experiments on three FSOD tasks with domain shifts: 1) CD-FSOD: Following the CD-ViTO benchmark [10], we evaluate on six diverse target domains: ArTaxOr [6] (photorealistic), Clipart1k [17] (cartoon), DIOR [26] (aerial), DeepFish [46] (underwater), NEU-DET [47] (industrial), and UODD [18] (underwater). 2) Remote Sensing FSOD (RS-FSOD): In addition to DIOR, we include NWPU VHR-10 [41], a popular remote sensing dataset for FSOD. 3) Camouflaged FSOD: We also test on CAMO-FS [39], a recent dataset with 47 categories where objects are deliberately camouflaged into the background. [...] To enable retrieval, we use COCO [31] as the database D_base, serving as a gallery of candidate backgrounds.
Dataset Splits | Yes | For each task, we follow the standard dataset splits and evaluation protocols: 1/5/10-shot for CD-FSOD, 3/5/10/20-shot for RS-FSOD, and 1/2/3/5-shot for Camouflaged FSOD.
Hardware Specification | Yes | All experiments are run on four Tesla V100 GPUs or eight 5880 Ada GPUs, or a single A800 GPU.
Software Dependencies | No | The paper mentions several software components such as 'Grounding DINO [33] with Swin-Transformer [35]', 'AdamW [36]', 'Flux-Redux model [23]', 'Flux-Fill', and 'LaMa inpainting [49]', but it does not specify version numbers for these libraries or tools within the main text.
Experiment Setup | Yes | For the retrieval stage, the hyperparameters are set to m = 100, n = 5 throughout all experiments. For the background generation stage, fusion hyperparameters λ1 and λ2 are set to 1.0 and 0.8 respectively. We fine-tune the model for 30 epochs by default, but reduce to 5 for faster-converging datasets like Clipart1k and DeepFish. We use AdamW [36] with learning rate and weight decay set to 1 × 10⁻⁴, and we scale the backbone's learning rate by 0.1.
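
For readers who want to reuse the reported settings, the values in the Experiment Setup and Dataset Splits rows can be gathered into a small configuration object. The following is a minimal sketch, not the authors' code: every class, field, and function name is hypothetical, the learning rate and weight decay are read as 1e-4 from the extracted text, and the "backbone." name prefix is an assumption about how a detector might name its parameters. The two-group layout mirrors the per-parameter-group dictionaries that optimizers such as PyTorch's AdamW accept, but parameter names stand in for tensors so the sketch needs only the standard library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneConfig:
    """Values quoted from the summary table; all field names are hypothetical."""
    retrieval_m: int = 100            # retrieval hyperparameter m
    retrieval_n: int = 5              # retrieval hyperparameter n
    lambda1: float = 1.0              # fusion hyperparameter for background generation
    lambda2: float = 0.8              # fusion hyperparameter for background generation
    base_lr: float = 1e-4             # assumed reading of the reported learning rate
    weight_decay: float = 1e-4        # assumed reading of the reported weight decay
    backbone_lr_scale: float = 0.1    # backbone learning rate is scaled by this factor
    default_epochs: int = 30
    fast_epochs: int = 5
    fast_datasets: frozenset = frozenset({"Clipart1k", "DeepFish"})

    def epochs_for(self, dataset: str) -> int:
        """Return 5 epochs for the faster-converging datasets, 30 otherwise."""
        if dataset in self.fast_datasets:
            return self.fast_epochs
        return self.default_epochs


# Shot settings per task, as listed in the Dataset Splits row.
SHOTS = {
    "CD-FSOD": (1, 5, 10),
    "RS-FSOD": (3, 5, 10, 20),
    "Camouflaged FSOD": (1, 2, 3, 5),
}


def build_param_groups(param_names, cfg, backbone_prefix="backbone."):
    """Split parameters into head and backbone groups with separate learning rates.

    In a real training script the "params" lists would hold tensors rather
    than names; names are used here to keep the sketch dependency-free.
    """
    backbone = [n for n in param_names if n.startswith(backbone_prefix)]
    rest = [n for n in param_names if not n.startswith(backbone_prefix)]
    return [
        {"params": rest, "lr": cfg.base_lr, "weight_decay": cfg.weight_decay},
        {
            "params": backbone,
            "lr": cfg.base_lr * cfg.backbone_lr_scale,
            "weight_decay": cfg.weight_decay,
        },
    ]
```

Calling `FinetuneConfig().epochs_for("DeepFish")` selects the shortened schedule, and `build_param_groups` places any parameter whose name starts with the assumed prefix in the group with the reduced learning rate.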