Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts

Authors: Chen Li, Huiying Xu, Changxin Gao, Zeyu Wang, Yun Liu, Xinzhong Zhu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Cauvis achieves state-of-the-art performance with 15.9–31.4% gains over existing domain generalization methods on SDGOD datasets, while exhibiting significant robustness advantages in complex interference environments. 5 Experiments 5.1 Settings Datasets. Our experimental datasets primarily follow the SDGOD benchmark [3], encompassing five distinct weather conditions: Day-Clear (DC), Day-Foggy (DF), Dusk-Rainy (DR), Night-Rainy (NR), and Night-Clear (NC). ... Table 2: Quantitative results (%) on target domain datasets. ... Table 3: Quantitative results (%) on Cityscapes-C (level-5). ... Table 4: Quantitative results (%) on BDD100K-C (level-5). ... Table 5: Detailed ablation study in Cauvis.
Researcher Affiliation	Academia	1 National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, 2 Zhejiang Key Laboratory of Intelligent Education Technology and Application, Zhejiang Normal University, 3 Northwest Polytechnical University, 4 Nankai University.
Pseudocode	No	The paper describes methods and processes in text and diagrams (Figure 3: Overview of the Cauvis) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	Project Link: https://github.com/lichen1015/Cauvis Required codes were submitted in the supplemental.
Open Datasets	Yes	Datasets. Our experimental datasets primarily follow the SDGOD benchmark [3], encompassing five distinct weather conditions... To evaluate generalization, we extend evaluation to Cityscapes-C [40], a benchmark containing 15 corruption types... BDD100K-C. Table 4 shows that Cauvis also yields consistent improvements on a more diverse, real-world corruption suite.
Dataset Splits	Yes	Datasets. Our experimental datasets primarily follow the SDGOD benchmark [3], encompassing five distinct weather conditions: Day-Clear (DC), Day-Foggy (DF), Dusk-Rainy (DR), Night-Rainy (NR), and Night-Clear (NC). ... Models are trained for 12 epochs on the Day Clear and then evaluated on four generalization sets (Day Foggy, Dusk Rainy, Night Rainy, Night Clear).
Hardware Specification	Yes	The experiments were conducted on 8 NVIDIA RTX 4090 GPUs, and the batch size is 16 for DINO [23] and 64 for Faster RCNN.
Software Dependencies	Yes	Our software environment is Ubuntu 22.04, CUDA 12.1, cuDNN 8.8, PyTorch 2.2.0, MMCV 2.2.0, and MMDetection 3.3.0.
Experiment Setup	Yes	Implementation Details. Our model employs DINO [23] and Faster RCNN [7] as the detection head. ... and train all models for 12 epochs using the AdamW optimizer (1 × 10⁻⁴, β1 = 0.9, β2 = 0.999, weight decay 10⁻⁴). The base learning rate is set to 10⁻⁴, with linear projections for object query reference points and sampling offsets using a 0.1× reduced rate. ... the batch size is 16 for DINO [23] and 64 for Faster RCNN. The DINOv2 freezes all parameters. ... All experiment configurations are summarized in Table 8.
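
The hyperparameters quoted in the Experiment Setup and Dataset Splits rows can be collected into a configuration sketch written as plain Python dictionaries in the style of an MMDetection 3.x config. This is an illustration only, not the authors' configuration file, and Table 8 of the paper is not reproduced here. The `custom_keys` names (`reference_points`, `sampling_offsets`) are assumptions borrowed from MMDetection's Deformable-DETR-style configs, and `effective_lr` is a hypothetical helper added to show how the reduced rate would apply.

```python
# Hypothetical sketch in the style of an MMDetection 3.x config.
# It is NOT the authors' file; values are taken from the table rows above.

BASE_LR = 1e-4  # reported base learning rate

optim_wrapper = dict(
    type="OptimWrapper",
    optimizer=dict(
        type="AdamW",
        lr=BASE_LR,
        betas=(0.9, 0.999),  # reported beta1, beta2
        weight_decay=1e-4,   # reported weight decay
    ),
    # Assumption: these key names follow MMDetection's Deformable-DETR-style
    # configs. The report only states that the linear projections for
    # reference points and sampling offsets use a rate reduced by 0.1.
    paramwise_cfg=dict(
        custom_keys={
            "reference_points": dict(lr_mult=0.1),
            "sampling_offsets": dict(lr_mult=0.1),
        }
    ),
)

# All models are reported to train for 12 epochs.
train_cfg = dict(type="EpochBasedTrainLoop", max_epochs=12)

# Batch sizes as reported; whether they are per-GPU or total is not stated.
batch_size = {"DINO": 16, "Faster RCNN": 64}

# Single-source setting: train on one domain, evaluate on the other four.
source_domain = "Day-Clear"
target_domains = ["Day-Foggy", "Dusk-Rainy", "Night-Rainy", "Night-Clear"]


def effective_lr(param_name: str) -> float:
    """Hypothetical helper: learning rate a parameter gets under custom_keys."""
    for key, cfg in optim_wrapper["paramwise_cfg"]["custom_keys"].items():
        if key in param_name:
            return BASE_LR * cfg["lr_mult"]
    return BASE_LR
```

Under these assumptions, a parameter whose name contains `sampling_offsets` would train at one tenth of the base rate, while all other trainable parameters use the base rate. The frozen DINOv2 parameters mentioned in the row are not modeled in this sketch.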