Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation
Authors: Jian Hu, Jiayi Lin, Junchi Yan, Shaogang Gong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on 5 benchmarks demonstrate the effectiveness of ProMaC. |
| Researcher Affiliation | Academia | 1 School of Electronic Engineering and Computer Science, Queen Mary University of London; 2 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University |
| Pseudocode | Yes | Algorithm 1 Algorithm of our ProMaC |
| Open Source Code | Yes | Code given in https://lwpyh.github.io/ProMaC/. |
| Open Datasets | Yes | We evaluated ProMaC on three representative datasets: CHAMELEON [50], CAMO [30], and COD10K [14]... MIS task... ColonDB [51] and Kvasir [25] for polyp image segmentation, and ISIC [10] for skin lesion segmentation... TOD task, we evaluated ProMaC on the GSD [34] and Trans10K-hard [56] datasets... OVS task... validation splits of PASCAL VOC (21 classes) [12, 11], Pascal Context (59 classes) [42], and COCO-Object (80 classes) [3]... What'sUp spatial reasoning dataset [27]. |
| Dataset Splits | Yes | Specifically, we tested it on the validation splits of PASCAL VOC (21 classes) [12, 11], Pascal Context (59 classes) [42], and COCO-Object (80 classes) [3] |
| Hardware Specification | Yes | Our experiment is conducted on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions specific models and versions like "LLaVA-1.5-13B", "CS-ViT-B/16", "stablediffusion-2-inpainting", and "ViT-H/16 version of SAM". However, it does not provide specific version numbers for underlying software dependencies such as Python, PyTorch, CUDA, or operating systems. |
| Experiment Setup | Yes | All tasks are optimized using training-free test-time adaptation, with each task iterating for four epochs, except for the polyp image segmentation task, which undergoes six epochs. ... Following [54], we set α = 1 in all tasks. ... where w is a hyperparameter, which we have assigned a value of 0.3. |
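
For anyone attempting a reproduction, the settings quoted in the Experiment Setup row could be collected into one small configuration object. This is a minimal sketch, not code from the ProMaC repository. The class `SetupConfig`, the function `config_for`, and the task label `"polyp"` are hypothetical names introduced here for illustration. The excerpt above does not say what α or w control, so they are stored as plain values.

```python
from dataclasses import dataclass


# Hypothetical container: field names are illustrative and are NOT taken
# from the ProMaC codebase. Values come from the Experiment Setup row.
@dataclass(frozen=True)
class SetupConfig:
    task: str
    epochs: int = 4      # test-time adaptation epochs for most tasks
    alpha: float = 1.0   # "we set α = 1 in all tasks"
    w: float = 0.3       # "w is a hyperparameter ... value of 0.3"


def config_for(task: str) -> SetupConfig:
    """Return the reported settings for a task label (labels are assumptions)."""
    # The polyp image segmentation task is the one reported exception: six epochs.
    epochs = 6 if task == "polyp" else 4
    return SetupConfig(task=task, epochs=epochs)
```

Calling `config_for("polyp")` yields six epochs, while any other label falls back to four; `alpha` and `w` are the same for every task.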