SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

Authors: Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our SegNeXt significantly improves the performance of previous state-of-the-art methods on popular benchmarks, including ADE20K, Cityscapes, COCO-Stuff, Pascal VOC, Pascal Context, and iSAID. Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters. On average, SegNeXt achieves about 2.0% mIoU improvement over the state-of-the-art methods on the ADE20K dataset with the same or fewer computations.
Researcher Affiliation | Collaboration | Meng-Hao Guo1, Cheng-Ze Lu2, Qibin Hou2, Zheng-Ning Liu3, Ming-Ming Cheng2, Shi-Min Hu1 — 1BNRist, Department of Computer Science and Technology, Tsinghua University; 2TMCC, CS, Nankai University; 3Fitten Tech, Beijing, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Project page: https://github.com/Jittor/JSeg
Open Datasets | Yes | Dataset. We evaluate our methods on seven popular datasets, including ImageNet-1K [14], ADE20K [111], Cityscapes [12], Pascal VOC [17], Pascal Context [65], COCO-Stuff [3], and iSAID [84].
Dataset Splits | Yes | ADE20K [111] is a challenging dataset which contains 150 semantic classes. It consists of 20,210/2,000/3,352 images in the training, validation, and test sets. Cityscapes [12] mainly focuses on urban scenes and contains 5,000 high-resolution images with 19 categories. There are 2,975/500/1,525 images for training, validation, and testing, respectively.
Hardware Specification | Yes | All models are trained on a node with 8 RTX 3090 GPUs. We test our method with a single RTX 3090 GPU and an AMD EPYC 7543 32-core CPU.
Software Dependencies | No | The paper mentions software such as the Jittor [32], PyTorch [68], timm [85], and mmsegmentation [11] libraries, but it does not specify their version numbers.
Experiment Setup | Yes | We adopt some common data augmentations, including random horizontal flipping, random scaling (from 0.5 to 2), and random cropping. The batch size is set to 8 for the Cityscapes dataset and 16 for all the other datasets. AdamW [61] is applied to train our models. We set the initial learning rate to 0.00006 and employ the poly learning-rate decay policy. We train our models for 160K iterations on the ADE20K, Cityscapes, and iSAID datasets and 80K iterations on the COCO-Stuff, Pascal VOC, and Pascal Context datasets.
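The quoted setup names a poly learning-rate decay policy with an initial rate of 6e-5. A minimal pure-Python sketch of that schedule is below; note the decay exponent (power = 0.9, the common mmsegmentation default) is an assumption, since the report quotes only the initial learning rate and the policy name, and the `poly_lr` helper is hypothetical, not from the paper's code.

```python
BASE_LR = 6e-5  # initial learning rate from the quoted setup
POWER = 0.9     # assumed exponent; not stated in the quoted text


def poly_lr(iteration: int, total_iters: int,
            base_lr: float = BASE_LR, power: float = POWER) -> float:
    """Poly decay: lr(t) = base_lr * (1 - t / T) ** power."""
    return base_lr * (1.0 - iteration / total_iters) ** power


# 160K total iterations for ADE20K/Cityscapes/iSAID, 80K for the others.
schedule = [poly_lr(it, 160_000) for it in (0, 40_000, 80_000, 120_000)]
```

The schedule starts at exactly `BASE_LR` and decays monotonically toward zero as the iteration count approaches the total, which is the behavior the poly policy is chosen for in long fixed-iteration segmentation training runs.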