Multi-Scale Representations by Varying Window Attention for Semantic Segmentation

Authors: Haotian Yan, Ming Wu, Chuang Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 EXPERIMENTS: Experiments are conducted on three public datasets including Cityscapes, ADE20K, and COCOStuff-164K (See D.2 for more information). The experiment protocols are the same as the compared method's official repository. For ablation studies, we choose the Swin-Base backbone as the testbed and use the same protocols as Swin-UperNet (See D.3 for more information).
Researcher Affiliation | Academia | Haotian Yan, Ming Wu & Chuang Zhang, Artificial Intelligence School, Beijing University of Posts and Telecommunications, China {yanhaotian,wuming,zhangchuang}@bupt.edu.cn
Pseudocode | No | The paper describes the proposed methods using mathematical formulas and text, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code and model will be available at https://github.com/yan-hao-tian/vw
Open Datasets | Yes | Experiments are conducted on three public datasets including Cityscapes, ADE20K, and COCOStuff-164K (See D.2 for more information). D.2 DETAILS OF DATASET: Cityscapes is an urban scene parsing dataset that contains 5,000 fine-annotated images... ADE20K is a challenging dataset in scene parsing... COCOStuff-164K is a very challenging benchmark.
Dataset Splits | Yes | D.2 DETAILS OF DATASET: Cityscapes is an urban scene parsing dataset that contains 5,000 fine-annotated images captured from 50 cities with 19 semantic classes. There are 2,975 images divided into a training set, 500 images divided into a validation set, and 1,525 images divided into a testing set. ADE20K is a challenging dataset in scene parsing. It consists of a training set of 20,210 images with 150 categories, a testing set of 3,352 images, and a validation set of 2,000 images.
Hardware Specification | Yes | The computing server on which all experiments are run has 16 Tesla V100 GPU cards.
Software Dependencies | No | The paper mentions PyTorch, MMSegmentation, and Detectron2 as software used, but it does not specify any version numbers for these software components.
Experiment Setup | Yes | The experiment protocols are the same as the compared method's official repository. For ablation studies, we choose the Swin-Base backbone as the testbed and use the same protocols as Swin-UperNet (See D.3 for more information). All methods are trained for 80000 iterations, and evaluated on Cityscapes as well as ADE20K. The input size is 768/769 x 768/769 for Cityscapes, and 512 x 512 for ADE20K.
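
For readers who want to sanity-check the figures quoted in the Dataset Splits and Experiment Setup rows, the numbers can be collected into a small Python structure. This is only an illustrative sketch: the names DATASET_SPLITS, EXPERIMENT_SETUP, and total_images are hypothetical and do not come from the authors' repository, and the values are copied from the quoted text rather than from any configuration file.

```python
# Illustrative summary of the figures quoted above. All names here are
# hypothetical; they are not taken from the authors' code.
DATASET_SPLITS = {
    "Cityscapes": {"train": 2975, "val": 500, "test": 1525},
    "ADE20K": {"train": 20210, "val": 2000, "test": 3352},
}

EXPERIMENT_SETUP = {
    "ablation_backbone": "Swin-Base",   # testbed named for the ablation studies
    "train_iterations": 80000,          # as quoted in the Experiment Setup row
    "input_size": {
        "Cityscapes": "768/769 x 768/769",  # kept verbatim from the quoted text
        "ADE20K": "512 x 512",
    },
}


def total_images(dataset: str) -> int:
    """Sum the train, validation, and test counts for one dataset."""
    return sum(DATASET_SPLITS[dataset].values())


cityscapes_total = total_images("Cityscapes")
ade20k_total = total_images("ADE20K")
print("Cityscapes images:", cityscapes_total)
print("ADE20K images:", ade20k_total)
```

The Cityscapes split counts add up to the 5,000 fine-annotated images stated in the same row, which is a quick consistency check on the quoted numbers.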