Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
RUNA: Object-Level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations
Authors: Bin Zhang, Jinggang Chen, Xiaoyang Qu, Guokuan Li, Kai Lu, Jiguang Wan, Jing Xiao, Jianzong Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that RUNA substantially surpasses state-of-the-art methods in object-level OOD detection, particularly in challenging scenarios with diverse and complex object instances. ... Experiments. Experimental Settings. Datasets and metrics. We use PASCAL-VOC (Everingham et al. 2010) and BDD-100K (Yu et al. 2020) as ID datasets and evaluate on two OOD datasets sourced from MSCOCO (Lin et al. 2014) and Open Images (Kuznetsova et al. 2020), ensuring no label overlap with ID datasets. ... Ablation Study. Ablation on the dual encoder and ID fine-tuning. We examine the impact of the dual encoder and ID fine-tuning components within our object-level OOD detection framework. ... Ablation on the number of fine-tuning samples (shots). We examine the effect of changing the number of samples used for model fine-tuning. The evaluation results are presented in Figure 5. ... Ablation on different backbones of the visual encoder. We assess the influence of various ViT backbones on the performance of the CLIP model, as illustrated in Table 3. |
| Researcher Affiliation | Collaboration | 1Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China 2Ping An Technology (Shenzhen) Co., Ltd, Shenzhen, China |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations, but it does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper uses the Detectron2 platform (Wu et al. 2019) and cites its repository, but there is no explicit statement about releasing the source code for the methodology described in this paper. |
| Open Datasets | Yes | We use PASCAL-VOC (Everingham et al. 2010) and BDD-100K (Yu et al. 2020) as ID datasets and evaluate on two OOD datasets sourced from MSCOCO (Lin et al. 2014) and Open Images (Kuznetsova et al. 2020), ensuring no label overlap with ID datasets. |
| Dataset Splits | No | The paper states, 'For few-shot learning, we perform fine-tuning using 10-shot samples,' but it does not provide specific training, validation, or test splits (percentages or counts) for the main datasets like PASCAL-VOC, BDD-100K, MSCOCO, or Open Images. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'Detectron2 platform', 'Faster R-CNN', and 'CLIP (ViT-B/16)', but it does not provide specific version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | For ID discriminative fine-tuning, we employ a batch size of 256 and use the AdamW optimizer, conducting fine-tuning over 100 epochs with a base learning rate of 5 × 10^-6. ... We set the dual encoder's fusion coefficient λ to 0.5 and the blur radius R of Gaussian Blur to 1. |
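As a reading aid, the hyperparameters quoted in the Experiment Setup row can be collected into a minimal sketch. This is a reconstruction from the quoted text only, not the authors' code (no source release is reported above); the names `CONFIG` and `fused_score` are placeholders, and the convex-combination form of the dual-encoder fusion is an assumption inferred from the fusion coefficient λ.

```python
# Hyperparameters quoted from the paper's experiment-setup section.
# All identifiers here are illustrative placeholders, not names from
# any released implementation.
CONFIG = {
    "batch_size": 256,
    "optimizer": "AdamW",
    "epochs": 100,
    "base_lr": 5e-6,       # base learning rate 5 x 10^-6
    "fusion_lambda": 0.5,  # dual-encoder fusion coefficient λ
    "blur_radius": 1,      # Gaussian blur radius R
    "shots": 10,           # few-shot fine-tuning uses 10-shot samples
}


def fused_score(score_a: float, score_b: float,
                lam: float = CONFIG["fusion_lambda"]) -> float:
    """Combine the two encoders' scores with coefficient λ.

    The paper only states that λ weights the dual encoder; a simple
    convex combination is assumed here for illustration.
    """
    return lam * score_a + (1.0 - lam) * score_b


# With λ = 0.5 the fused score is the mean of the two encoder scores.
print(fused_score(0.8, 0.4))
```

With λ = 0.5 both encoders contribute equally; moving λ toward 1.0 would weight the first encoder's score more heavily.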