Semantic Diffusion Network for Semantic Segmentation
Authors: Haoru Tan, Sitong Wu, Jimin Pi
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our approach can achieve consistent improvements over several typical and state-of-the-art segmentation baseline models on challenging public benchmarks. To evaluate its effectiveness, we integrate our SDN into multiple baseline segmentation models and conduct experiments on two challenging benchmarks (i.e., ADE20K [62] and Cityscapes [9]). |
| Researcher Affiliation | Collaboration | Haoru Tan, Sitong Wu (Baidu Research); tanhr2014@163.com, wusitong98@gmail.com, jpi@connect.ust.hk |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper's self-assessment indicates code is available, but the main body of the paper does not provide an explicit statement or URL for open-source code. |
| Open Datasets | Yes | Experiments are conducted on two widely-used public benchmarks: ADE20K [62] is a very challenging benchmark including 150 categories, which is split into 20000 and 2000 images for training and validation. Cityscapes [9] is a real-world scene parsing benchmark, which contains over 5000 urban scene images with 19 classes. The number of images for training, validation, and testing are 2975, 500, and 1525, respectively. |
| Dataset Splits | Yes | ADE20K [62] is a very challenging benchmark including 150 categories, which is split into 20000 and 2000 images for training and validation. Cityscapes [9] is a real-world scene parsing benchmark, which contains over 5000 urban scene images with 19 classes. The number of images for training, validation, and testing are 2975, 500, and 1525, respectively. |
| Hardware Specification | Yes | All the experiments are implemented with PyTorch [1] and conducted on 4 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Specifically, for CNN baselines (i.e., FCN and Semantic FPN), we use the SGD optimizer with momentum 0.9 and weight decay 0.0005. The learning rate is initialized at 0.01 and decays until 1e-4 by a polynomial strategy with power 0.9. For Transformer baselines (i.e., Segmenter), AdamW is used as the optimizer without weight decay. The learning rate is initialized as 6e-5 and decays until 0 via a polynomial strategy with power 1.0. For ADE20K, we train the models with 160k iterations, each of which involves 16 images. For Cityscapes, the training contains 80k iterations with a batch size of 8. Synchronized BN [32] is used to synchronize the mean and standard deviation of BN [38] across multiple GPUs. The backbone is initialized by the ImageNet-1K [36] and ImageNet-22K [10] pre-trained weights for ResNet-50 [14] and ViT-B [11], respectively. |
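The polynomial learning-rate decay quoted in the Experiment Setup row can be sketched as a small function. This is a minimal illustration of the standard "poly" schedule, assuming the common form `lr = (base_lr - min_lr) * (1 - iter/max_iter)^power + min_lr`; the paper's exact implementation (likely via a segmentation framework's scheduler) may differ in details such as warmup.

```python
def poly_lr(base_lr, cur_iter, max_iter, power, min_lr=0.0):
    """Polynomial LR decay from base_lr down to min_lr over max_iter steps.

    lr = (base_lr - min_lr) * (1 - cur_iter / max_iter) ** power + min_lr
    """
    factor = (1.0 - cur_iter / max_iter) ** power
    return (base_lr - min_lr) * factor + min_lr

# CNN baselines (SGD): base_lr=0.01, power=0.9, decaying until 1e-4
lr_cnn_start = poly_lr(0.01, 0, 160_000, 0.9, min_lr=1e-4)
lr_cnn_end = poly_lr(0.01, 160_000, 160_000, 0.9, min_lr=1e-4)

# Transformer baselines (AdamW): base_lr=6e-5, power=1.0, decaying until 0
lr_vit_mid = poly_lr(6e-5, 80_000, 160_000, 1.0)
```

With power 1.0 the schedule is a straight line, so the Transformer learning rate at the halfway point is exactly half the initial value.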