Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation

Authors: Jian Hu, Jiayi Lin, Junchi Yan, Shaogang Gong

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on 5 benchmarks demonstrate the effectiveness of ProMaC.
Researcher Affiliation	Academia	(1) School of Electronic Engineering and Computer Science, Queen Mary University of London; (2) Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
Pseudocode	Yes	Algorithm 1: Algorithm of our ProMaC
Open Source Code	Yes	Code given in https://lwpyh.github.io/ProMaC/.
Open Datasets	Yes	We evaluated ProMaC on three representative datasets: CHAMELEON [50], CAMO [30], and COD10K [14]... MIS task... ColonDB [51] and Kvasir [25] for polyp image segmentation, and ISIC [10] for skin lesion segmentation... TOD task, we evaluated ProMaC on the GSD [34] and Trans10K-hard [56] datasets... OVS task... validation splits of PASCAL VOC (21 classes) [12, 11], Pascal Context (59 classes) [42], and COCO-Object (80 classes) [3]... What'sUp spatial reasoning dataset [27].
Dataset Splits	Yes	Specifically, we tested it on the validation splits of PASCAL VOC (21 classes) [12, 11], Pascal Context (59 classes) [42], and COCO-Object (80 classes) [3]
Hardware Specification	Yes	Our experiment is conducted on a single NVIDIA A100 GPU.
Software Dependencies	No	The paper mentions specific models and versions like "LLaVA-1.5-13B", "CS-ViT-B/16", "stablediffusion-2-inpainting", and "ViT-H/16 version of SAM". However, it does not provide specific version numbers for underlying software dependencies such as Python, PyTorch, CUDA, or operating systems.
Experiment Setup	Yes	All tasks are optimized using training-free test-time adaptation, with each task iterating for four epochs, except for the polyp image segmentation task, which undergoes six epochs. ... Following [54], we set α = 1 in all tasks. ... where w is a hyperparameter, which we have assigned a value of 0.3.