Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Authors: Haotian Yan, Ming Wu, Chuang Zhang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS: Experiments are conducted on three public datasets including Cityscapes, ADE20K, and COCOStuff-164K (See D.2 for more information). The experiment protocols are the same as the compared method's official repository. For ablation studies, we choose the Swin-Base backbone as the testbed and use the same protocols as Swin-UperNet (See D.3 for more information). |
| Researcher Affiliation | Academia | Haotian Yan, Ming Wu & Chuang Zhang. Artificial Intelligence School, Beijing University of Posts and Telecommunications, China. {yanhaotian,wuming,zhangchuang}@bupt.edu.cn |
| Pseudocode | No | The paper describes the proposed methods using mathematical formulas and text, but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and model will be available at https://github.com/yan-hao-tian/vw |
| Open Datasets | Yes | Experiments are conducted on three public datasets including Cityscapes, ADE20K, and COCOStuff-164K (See D.2 for more information). D.2 DETAILS OF DATASET: Cityscapes is an urban scene parsing dataset that contains 5,000 fine-annotated images... ADE20K is a challenging dataset in scene parsing... COCOStuff-164K is a very challenging benchmark. |
| Dataset Splits | Yes | D.2 DETAILS OF DATASET: Cityscapes is an urban scene parsing dataset that contains 5,000 fine-annotated images captured from 50 cities with 19 semantic classes. There are 2,975 images divided into a training set, 500 images divided into a validation set, and 1,525 images divided into a testing set. ADE20K is a challenging dataset in scene parsing. It consists of a training set of 20,210 images with 150 categories, a testing set of 3,352 images, and a validation set of 2,000 images. |
| Hardware Specification | Yes | The computing server on which all experiments are run has 16 Tesla V100 GPU cards. |
| Software Dependencies | No | The paper mentions PyTorch, MMSegmentation, and Detectron2 as software used, but it does not specify any version numbers for these software components. |
| Experiment Setup | Yes | The experiment protocols are the same as the compared method's official repository. For ablation studies, we choose the Swin-Base backbone as the testbed and use the same protocols as Swin-UperNet (See D.3 for more information). All methods are trained for 80,000 iterations, and evaluated on Cityscapes as well as ADE20K. The input size is 768/769 × 768/769 for Cityscapes, and 512 × 512 for ADE20K. |
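
The split sizes and ablation settings quoted in the table can be collected into a small, checkable record. The following is only a sketch for bookkeeping: the names `DatasetSplit`, `CITYSCAPES`, `ADE20K`, and `ABLATION_SETUP` are hypothetical and do not come from the paper or its repository, and the Cityscapes crop is written as 768 although the paper states 768/769.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSplit:
    """Image counts per split, as quoted in the table above (hypothetical helper)."""
    name: str
    train: int
    val: int
    test: int
    num_classes: int

    @property
    def total(self) -> int:
        # Sum of all three splits.
        return self.train + self.val + self.test


# Figures copied from the "Dataset Splits" row.
CITYSCAPES = DatasetSplit("Cityscapes", train=2975, val=500, test=1525, num_classes=19)
ADE20K = DatasetSplit("ADE20K", train=20210, val=2000, test=3352, num_classes=150)

# Figures copied from the "Experiment Setup" row.
# Assumption: 768 is used here; the paper lists 768/769 for Cityscapes.
ABLATION_SETUP = {
    "backbone": "Swin-Base",
    "max_iters": 80000,
    "input_size": {"Cityscapes": (768, 768), "ADE20K": (512, 512)},
}

if __name__ == "__main__":
    for split in (CITYSCAPES, ADE20K):
        print(f"{split.name}: {split.total} images, {split.num_classes} classes")
```

The Cityscapes split counts sum to the 5,000 fine-annotated images stated in the same row, which is a quick consistency check on the quoted numbers.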