Per-Pixel Classification is Not All You Need for Semantic Segmentation

Authors: Bowen Cheng, Alex Schwing, Alexander Kirillov

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate MaskFormer on five semantic segmentation datasets with various numbers of categories: Cityscapes [13] (19 classes), Mapillary Vistas [31] (65 classes), ADE20K [49] (150 classes), COCO-Stuff-10K [2] (171 classes), and ADE20K-Full [49] (847 classes). MaskFormer achieves the new state-of-the-art on ADE20K (55.6 mIoU) with Swin-Transformer [27] backbone, outperforming a per-pixel classification model [27] with the same backbone by 2.1 mIoU, while being more efficient (10% reduction in parameters and 40% reduction in FLOPs)."
Researcher Affiliation | Collaboration | ¹Facebook AI Research (FAIR), ²University of Illinois at Urbana-Champaign (UIUC)
Pseudocode | No | The paper includes diagrams and descriptions of the model architecture but does not contain any structured pseudocode or algorithm blocks (a sketch of the described inference step is given after this table).
Open Source Code | No | The paper provides a link to a "Project page" (https://bowenc0221.github.io/maskformer) but does not explicitly state that source code for the methodology is available there, nor does it give a direct link to a code repository.
Open Datasets | Yes | "We study MaskFormer using four widely used semantic segmentation datasets: ADE20K [49] (150 classes) from the SceneParse150 challenge [48], COCO-Stuff-10K [2] (171 classes), Cityscapes [13] (19 classes), and Mapillary Vistas [31] (65 classes). In addition, we use the ADE20K-Full [49] (847 classes) dataset annotated in an open vocabulary setting... For panoptic segmentation evaluation we use COCO [26, 2, 22] (80 'things' and 53 'stuff' categories) and ADE20K-Panoptic [49, 22] (100 'things' and 50 'stuff' categories)."
Dataset Splits | Yes | "We evaluate the models on ADE20K val with 150 categories."
Hardware Specification | Yes | "All models are trained with 8 V100 GPUs."
Software Dependencies | No | The paper mentions using Detectron2 [42] but does not provide specific version numbers for it or any other key software dependencies.
Experiment Setup | Yes | "More specifically, we use AdamW [29] and the poly [6] learning rate schedule with an initial learning rate of 10⁻⁴ and a weight decay of 10⁻⁴ for ResNet [20] backbones, and an initial learning rate of 6×10⁻⁵ and a weight decay of 10⁻² for Swin-Transformer [27] backbones. (...) For the ADE20K dataset, if not stated otherwise, we use a crop size of 512×512, a batch size of 16 and train all models for 160k iterations." (A hedged configuration sketch follows the table.)
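
Although the paper ships no pseudocode (see the Pseudocode row above), the mask-classification semantic inference it describes is compact: per-query class probabilities are marginalized over per-query mask predictions. Below is a minimal, hypothetical PyTorch sketch of that step; the function name, tensor names, and shapes are our assumptions, not the authors' released code.

```python
import torch

def semantic_inference(class_logits: torch.Tensor,
                       mask_logits: torch.Tensor) -> torch.Tensor:
    """Marginalize per-query class probabilities over per-query masks.

    class_logits: [N, K+1] scores for K classes plus a "no object" slot.
    mask_logits:  [N, H, W] binary mask logits, one mask per query.
    Returns a [K, H, W] map of per-pixel class scores.
    """
    # Softmax over K+1 categories, then drop the trailing "no object" class.
    class_probs = class_logits.softmax(dim=-1)[:, :-1]  # [N, K]
    mask_probs = mask_logits.sigmoid()                  # [N, H, W]
    # Per-pixel class score = sum over queries of p(class) * p(mask).
    return torch.einsum("nk,nhw->khw", class_probs, mask_probs)

# Example with made-up shapes: 100 queries, 150 ADE20K classes, 64x64 masks.
cls = torch.randn(100, 151)
msk = torch.randn(100, 64, 64)
segmentation = semantic_inference(cls, msk).argmax(dim=0)  # [64, 64] labels
```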
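The optimizer settings quoted in the Experiment Setup row translate directly into a few lines of configuration. The following is a hedged sketch assuming standard PyTorch APIs; the poly power of 0.9 and all variable names are our assumptions, and the model is a stand-in.

```python
import torch

# Settings for the ResNet-backbone recipe quoted in the table; the paper
# reports lr=6e-5 and weight_decay=1e-2 for Swin backbones instead.
BASE_LR = 1e-4
WEIGHT_DECAY = 1e-4
MAX_ITERS = 160_000  # ADE20K schedule from the quote
POLY_POWER = 0.9     # assumption: a common default for the poly schedule [6]

model = torch.nn.Conv2d(3, 19, kernel_size=1)  # stand-in for the real model
optimizer = torch.optim.AdamW(
    model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)
# Poly schedule: lr(it) = BASE_LR * (1 - it / MAX_ITERS) ** POLY_POWER
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda it: (1.0 - it / MAX_ITERS) ** POLY_POWER)
```

In a training loop, scheduler.step() would be called once per iteration after optimizer.step(), so the learning rate decays smoothly toward zero over the 160k iterations.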