Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

OPMapper: Enhancing Open-Vocabulary Semantic Segmentation with Multi-Guidance Information

Authors: Xuehui Wang, Chongjie Si, Xue Yang, Yuzhi Zhao, Wenhai Wang, Xiaokang Yang, Wei Shen

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate its effectiveness, yielding significant improvements across 8 open-vocabulary segmentation benchmarks. We evaluate our method on 8 widely used semantic segmentation benchmarks, ensuring a comprehensive assessment. The datasets include PASCAL VOC20 (VOC21, with one additional background class) [18], PASCAL Context59 (Context60, with background) [39], COCO Object (Object) [2], COCO-Stuff (Stuff) [2], Cityscapes (City) [13], and ADE20K (ADE) [55]. Mean Intersection-over-Union (mIoU) is used as the evaluation metric across all datasets. Our mapper is trained on the COCO-Stuff dataset, which contains 118k densely annotated images spanning 171 categories. We then integrate our lightweight mapper into two types of existing methods: training-based methods, including SAN [49], SCAN [17] and CAT-Seg [10], and training-free methods, including ProxyCLIP [28], ClearCLIP [27], SCLIP [45], LPOSS [42], CASS [26] and MaskCLIP [15].
Researcher Affiliation	Academia	1 MoE Key Lab of Artificial Intelligence, AI Institute, School of Computer Science, SJTU; 2 School of Automation and Intelligent Sensing, SJTU; 3 City University of Hong Kong; 4 MMLab, Chinese University of Hong Kong
Pseudocode	No	The paper describes methods using mathematical formulations (e.g., equations 1-16) and architectural diagrams, but it does not include a clearly labeled pseudocode block or algorithm.
Open Source Code	No	Our code will be released on GitHub upon acceptance of the paper (if accepted).
Open Datasets	Yes	We evaluate our method on 8 widely used semantic segmentation benchmarks, ensuring a comprehensive assessment. The datasets include PASCAL VOC20 (VOC21, with one additional background class) [18], PASCAL Context59 (Context60, with background) [39], COCO Object (Object) [2], COCO-Stuff (Stuff) [2], Cityscapes (City) [13], and ADE20K (ADE) [55]. Mean Intersection-over-Union (mIoU) is used as the evaluation metric across all datasets.
Dataset Splits	Yes	We evaluate our method on 8 widely used semantic segmentation benchmarks, ensuring a comprehensive assessment. The datasets include PASCAL VOC20 (VOC21, with one additional background class) [18], PASCAL Context59 (Context60, with background) [39], COCO Object (Object) [2], COCO-Stuff (Stuff) [2], Cityscapes (City) [13], and ADE20K (ADE) [55]. ... Our mapper is trained on the COCO-Stuff dataset, which contains 118k densely annotated images spanning 171 categories. ... All experiments were conducted using MMSegmentation [11]. During training, COCO-Stuff images were resized to have a 384-pixel short edge, followed by proportional scaling and random cropping to 384 x 384.
Hardware Specification	Yes	Our mapper is highly memory-efficient, requiring 2GB of GPU memory per batch, and was trained on four NVIDIA A100 GPUs (batch size: 8 per GPU) for 80,000 iterations.
Software Dependencies	No	All experiments were conducted using MMSegmentation [11].
Experiment Setup	Yes	We used the AdamW optimizer with learning rates of 5 × 10^-4 for the SAA and 5 × 10^-5 for the CAI. Our mapper is highly memory-efficient, requiring 2GB of GPU memory per batch, and was trained on four NVIDIA A100 GPUs (batch size: 8 per GPU) for 80,000 iterations, completing in 3-4 hours. ... w_KL, w_DICE, and w_BCE are set to 10, 5, 10, respectively.
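
The table above names mean Intersection-over-Union (mIoU) as the evaluation metric across all datasets. As a rough illustration of what that metric measures, the following is a minimal pure-Python sketch, not the paper's or MMSegmentation's evaluation code. The function name `mean_iou`, the flattened label lists, and the `ignore_index` default of 255 are all assumptions made for this example; real benchmark evaluation typically accumulates intersections and unions over the whole dataset before averaging.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "unlabeled" value, not taken from the paper
) -> float:
    """Mean IoU over the classes that occur in either pred or target.

    `pred` and `target` are flattened per-pixel class ids of equal length.
    Target labels other than `ignore_index` are assumed to lie in
    [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        if p == t:
            # A correct pixel adds one to both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # A wrong pixel enlarges the union of the true class
            # and of the (valid) predicted class.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1

    # Average only over classes that actually appear.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Two classes, four pixels: class 0 has IoU 1/2, class 1 has IoU 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Averaging per-class IoU rather than per-pixel accuracy keeps rare classes from being swamped by frequent ones, which matters for benchmarks with many categories.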