Progressive Neighborhood Aggregation for Semantic Segmentation Refinement

Authors: Ting Liu, Yunchao Wei, Yanning Zhang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on five segmentation datasets, including PASCAL VOC 2012, Cityscapes, COCO-Stuff 10k, DeepGlobe, and Trans10K, demonstrate that the proposed framework can be cascaded into existing segmentation models, providing consistent improvements. We conduct extensive experiments on this dataset to evaluate the effects of each component of our proposed refinement framework. The ablation studies on the PASCAL VOC 2012 validation dataset of the proposed PNA are based on ResNet50. (A generic sketch of cascading a refinement module onto an existing segmentation model follows the table.)
Researcher Affiliation | Academia | Ting Liu¹*, Yunchao Wei², Yanning Zhang¹; ¹ Northwestern Polytechnical University, China; ² Beijing Jiaotong University, China
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. It provides figures illustrating the framework, but not step-by-step, code-like procedures.
Open Source Code | Yes | The code is available at https://github.com/liutinglt/PNA.
Open Datasets | Yes | We conduct experiments on five publicly available segmentation benchmarks, PASCAL VOC 2012 (Everingham et al. 2015), Cityscapes (Cordts et al. 2016), COCO-Stuff 10k (Caesar, Uijlings, and Ferrari 2018), DeepGlobe (Demir et al. 2018), and Trans10K (Xie et al. 2020).
Dataset Splits | Yes | It consists of 21 classes and splits 1,464 images for training, 1,449 for validation, and 1,456 for testing. We use the augmented training set (Hariharan et al. 2011) including 10,582 images for training. (A sketch of assembling this augmented training set follows the table.)
Hardware Specification | Yes | All the models are trained on four 3090 GPUs with a batch size of 16.
Software Dependencies | No | The paper mentions using the 'AdamW optimizer' but does not specify any software versions for libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages (e.g., Python version) used for implementation.
Experiment Setup | Yes | The AdamW optimizer with an initial learning rate of 1e-4 is adopted to optimize models, and the learning rate is decayed following the polynomial annealing policy with a power of 0.9. All the models are trained on four 3090 GPUs with a batch size of 16. During training, the images are randomly cropped to 512 × 512, and all our implemented models were trained for 20k iterations. (A minimal training-schedule sketch follows the table.)
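As noted in the Research Type row, the framework is described as cascadable onto existing segmentation models. Below is a minimal, hypothetical sketch of that cascading pattern only; the refiner interface, the argument order, and the option of freezing the base model are illustrative assumptions, not the paper's actual PNA design.

```python
# Hypothetical sketch: cascading a refinement module onto an existing
# segmentation model. The refiner here is a placeholder, not the paper's PNA.
import torch
import torch.nn as nn

class CascadedSegmenter(nn.Module):
    def __init__(self, base_model: nn.Module, refiner: nn.Module,
                 freeze_base: bool = True):
        super().__init__()
        self.base_model = base_model  # any off-the-shelf segmentation network
        self.refiner = refiner        # refinement module applied to coarse logits
        if freeze_base:
            for p in self.base_model.parameters():
                p.requires_grad = False  # assumption: refine a frozen base model

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        coarse_logits = self.base_model(images)               # coarse per-pixel scores
        refined_logits = self.refiner(images, coarse_logits)  # refine with image cues
        return refined_logits
```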
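The Dataset Splits row quotes the standard PASCAL VOC 2012 protocol (1,464 train / 1,449 val / 1,456 test) plus the augmented training set of Hariharan et al. (2011). A minimal sketch of assembling that augmented training set with torchvision follows; the paths and the use of torchvision are assumptions, and de-duplication of images shared by the two sources is not handled here.

```python
# Sketch (assumed tooling): PASCAL VOC 2012 train split plus the SBD
# annotations (Hariharan et al. 2011), giving roughly the 10,582-image
# augmented training set quoted in the report.
from torch.utils.data import ConcatDataset
from torchvision.datasets import VOCSegmentation, SBDataset

voc_train = VOCSegmentation(root="data/voc", year="2012",
                            image_set="train", download=False)
sbd_train = SBDataset(root="data/sbd", image_set="train_noval",
                      mode="segmentation", download=False)
voc_val = VOCSegmentation(root="data/voc", year="2012",
                          image_set="val", download=False)

# Naive union of the two training sources; overlap handling is omitted.
train_set = ConcatDataset([voc_train, sbd_train])
```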
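The Experiment Setup row specifies AdamW with an initial learning rate of 1e-4, polynomial learning-rate annealing with power 0.9, a batch size of 16, 512 × 512 random crops, and 20k training iterations. A minimal sketch of that optimizer and schedule is given below; the model, loss, and data loader are placeholders rather than the released PNA code.

```python
# Sketch of the quoted optimization schedule: AdamW, base lr 1e-4,
# polynomial decay lr(t) = base_lr * (1 - t / max_iters) ** 0.9 over 20k iterations.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

def build_optimizer_and_scheduler(model, base_lr=1e-4, max_iters=20_000, power=0.9):
    optimizer = AdamW(model.parameters(), lr=base_lr)
    # Scheduler is stepped once per iteration, not per epoch.
    scheduler = LambdaLR(
        optimizer,
        lr_lambda=lambda it: max(0.0, 1.0 - it / max_iters) ** power,
    )
    return optimizer, scheduler

# Usage inside an iteration-based training loop (batch size 16, 512x512 crops):
# for it, (images, labels) in enumerate(loader):
#     loss = criterion(model(images), labels)
#     optimizer.zero_grad(); loss.backward(); optimizer.step(); scheduler.step()
```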