Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Multi-Scale Representations by Varying Window Attention for Semantic Segmentation

Authors: Haotian Yan, Ming Wu, Chuang Zhang

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 EXPERIMENTS Experiments are conducted on three public datasets including Cityscapes, ADE20K, and COCOStuff-164K (See D.2 for more information). The experiment protocols are the same as the compared method's official repository. For ablation studies, we choose the Swin-Base backbone as the testbed and use the same protocols as Swin-UperNet (See D.3 for more information).
Researcher Affiliation	Academia	Haotian Yan, Ming Wu & Chuang Zhang, Artificial Intelligence School, Beijing University of Posts and Telecommunications, China EMAIL
Pseudocode	No	The paper describes the proposed methods using mathematical formulas and text, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code	Yes	The code and model will be available at https://github.com/yan-hao-tian/vw
Open Datasets	Yes	Experiments are conducted on three public datasets including Cityscapes, ADE20K, and COCOStuff-164K (See D.2 for more information). D.2 DETAILS OF DATASET: Cityscapes is an urban scene parsing dataset that contains 5,000 fine-annotated images... ADE20K is a challenging dataset in scene parsing... COCOStuff-164K is a very challenging benchmark.
Dataset Splits	Yes	D.2 DETAILS OF DATASET: Cityscapes is an urban scene parsing dataset that contains 5,000 fine-annotated images captured from 50 cities with 19 semantic classes. There are 2,975 images divided into a training set, 500 images divided into a validation set, and 1,525 images divided into a testing set. ADE20K is a challenging dataset in scene parsing. It consists of a training set of 20,210 images with 150 categories, a testing set of 3,352 images, and a validation set of 2,000 images.
Hardware Specification	Yes	The computing server on which all experiments are run has 16 Tesla V100 GPU cards.
Software Dependencies	No	The paper mentions PyTorch, MMSegmentation, and Detectron2 as software used, but it does not specify any version numbers for these software components.
Experiment Setup	Yes	The experiment protocols are the same as the compared method's official repository. For ablation studies, we choose the Swin-Base backbone as the testbed and use the same protocols as Swin-UperNet (See D.3 for more information). All methods are trained for 80000 iterations, and evaluated on Cityscapes as well as ADE20K. The input size is 768/769 x 768/769 for Cityscapes, and 512 x 512 for ADE20K.
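
The split sizes quoted in the Dataset Splits row can be checked with simple arithmetic: the three Cityscapes splits should add up to the stated 5,000 fine-annotated images. The following is a minimal sketch of that check, not code from the paper or its repository; the names `SPLITS`, `total_images`, and `split_fractions` are hypothetical and exist only for this illustration.

```python
# Split sizes copied from the Dataset Splits row above (paper appendix D.2).
# The dictionary layout and the function names are illustrative assumptions.
SPLITS = {
    "Cityscapes": {"train": 2975, "val": 500, "test": 1525},
    "ADE20K": {"train": 20210, "val": 2000, "test": 3352},
}


def total_images(dataset: str) -> int:
    """Return the sum of the train, val, and test sizes for one dataset."""
    return sum(SPLITS[dataset].values())


def split_fractions(dataset: str) -> dict:
    """Return each split's share of the dataset total as a float in [0, 1]."""
    total = total_images(dataset)
    return {name: size / total for name, size in SPLITS[dataset].items()}


if __name__ == "__main__":
    for name in SPLITS:
        print(name, total_images(name), split_fractions(name))
```

For Cityscapes the total comes to 5,000, which agrees with the figure quoted in the row. The row states no overall total for ADE20K, so the sum for that dataset is only a derived number.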