Learning Debiased and Disentangled Representations for Semantic Segmentation

Authors: Sanghyeok Chu, Dongwan Kim, Bohyung Han

NeurIPS 2021

Reproducibility Assessment
Each entry below gives the variable, the assessed result, and the supporting LLM response (quoting the paper where available).
Research Type: Experimental
Response: "Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks, with especially notable performance gains on under-represented classes."
Researcher Affiliation: Collaboration
Response: "Sanghyeok Chu, Dongwan Kim, Bohyung Han. ECE & ASRI, Seoul National University. {sanghyeok.chu, dongwan123, bhhan}@snu.ac.kr. This work was partly supported by Samsung Advanced Institute of Technology, Korean ICT R&D programs of the MSIT/IITP grant [2017-0-01779, XAI; 2021-0-01343, Artificial Intelligence Graduate School Program (Seoul National University)], and the Bio & Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korea government (MSIT) [2021M3A9E4080782]."
Pseudocode: Yes
Response: "Algorithm 1: Training scheme for Drop Class."
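For readers who want a concrete picture of what Algorithm 1 describes, the following is a minimal PyTorch-style sketch of a class-dropping training step. It is an assumption-based illustration, not the authors' exact algorithm: the function names (`drop_class_targets`, `training_step`), the single-class drop, and the use of the ignore label to exclude dropped pixels from the loss are all our own choices.

```python
import random
import torch
import torch.nn.functional as F

IGNORE_INDEX = 255   # conventional Cityscapes "ignore" label
NUM_CLASSES = 19

def drop_class_targets(targets, drop_prob):
    """With probability drop_prob, erase one class from the label map.

    Pixels of the dropped class are remapped to IGNORE_INDEX so the
    cross-entropy loss skips them entirely (a hypothetical realization
    of "dropping" a class).
    """
    targets = targets.clone()
    if random.random() < drop_prob:
        present = torch.unique(targets)
        present = present[present != IGNORE_INDEX]
        if present.numel() > 1:  # keep at least one class in the target
            dropped = present[random.randrange(present.numel())]
            targets[targets == dropped] = IGNORE_INDEX
    return targets

def training_step(model, images, targets, drop_prob):
    """One optimization step with class dropping applied to the targets."""
    targets = drop_class_targets(targets, drop_prob)
    logits = model(images)  # expected shape: (B, NUM_CLASSES, H, W)
    return F.cross_entropy(logits, targets, ignore_index=IGNORE_INDEX)
```
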
Open Source Code: No
Response: The paper links to the third-party network implementations it builds on (HRNet, PyTorch DeepLabV3) but does not provide the authors' own implementation of the proposed Drop Class method.
Open Datasets: Yes
Response: "To evaluate the effectiveness of Drop Class, we experiment on a well-known semantic segmentation dataset, Cityscapes [6], for a few reasons. First, it is relatively small, with 2975 train images. Second, it has large class imbalance, with per-class pixel frequency ranging from 0.1% to 36.9%. These two properties make it difficult for an ordinary model to learn robust, debiased representations. We also conduct experiments on the Pascal VOC dataset [14], and the results can be found in the supplementary document."
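To make the imbalance claim concrete, here is a small sketch (our illustration, not code from the paper) that measures per-class pixel frequency over a set of Cityscapes label maps; with it one can verify figures like the 0.1% to 36.9% range quoted above.

```python
import torch

def class_pixel_frequencies(label_maps, num_classes=19, ignore_index=255):
    """Fraction of valid (non-ignored) pixels belonging to each class."""
    counts = torch.zeros(num_classes)
    for lbl in label_maps:
        valid = lbl[lbl != ignore_index].long()
        counts += torch.bincount(valid.flatten(), minlength=num_classes).float()
    return counts / counts.sum()
```

Classes whose frequency comes out at a few tenths of a percent are exactly the under-represented ones Drop Class is designed to help.
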
Dataset Splits: Yes
Response: "Cityscapes [6] ... is relatively small, with 2975 train images. ... we design an unbiased test set (I*), where the validation set is copied 19 times, and one of the 19 classes in Cityscapes is erased from each copy."
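The quoted protocol maps naturally to code. Below is a minimal sketch under an explicit assumption: since the excerpt does not specify how a class is "erased," we approximate erasure by remapping that class's pixels to the ignore label, producing 19 evaluation copies of the validation labels. The helper name `make_unbiased_copies` is hypothetical.

```python
import torch

def make_unbiased_copies(val_labels, num_classes=19, ignore_index=255):
    """Yield (erased_class, labels) pairs, one copy per Cityscapes class.

    Assumption: "erasing" class c means remapping its pixels to the
    ignore label, so each copy is scored as if class c were absent.
    """
    for c in range(num_classes):
        copy = [lbl.clone() for lbl in val_labels]
        for lbl in copy:
            lbl[lbl == c] = ignore_index
        yield c, copy
```
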
Hardware Specification: Yes
Response: "The rest of our hyperparameters are organized as a table in the supplementary materials, where we detail the number of iterations, batch size, learning rate, learning rate decay, image size, and type of GPU used for each of our experiments."
Software Dependencies: No
Response: The paper mentions using PyTorch [16] but does not provide a specific version number for it or for other software dependencies.
Experiment Setup: Yes
Response: "We use the same set of hyperparameters for both experiments to ensure a fair comparison. The value of the loss weighting term α in Eq. (9) is set to 10 for all experiments, based on the scale of the two loss terms. As mentioned in Section 2.3, the value of λ is initialized to 0 and scaled linearly up to 1 over the course of training. To further stabilize training, we also linearly increase the probability of dropping any class from 0 to 1."
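As a worked illustration of the schedules described above, the sketch below ramps both λ and the class-drop probability linearly from 0 to 1 over training, with α fixed at 10. How α and λ combine the two loss terms in Eq. (9) is not quoted here, so the final line is a hypothetical composition.

```python
ALPHA = 10.0  # loss weighting term alpha from Eq. (9)

def linear_ramp(step, total_steps):
    """Scale linearly from 0 at the first step to 1 at the last step."""
    return min(1.0, step / max(1, total_steps - 1))

# Inside a training loop (illustrative):
#   lam = linear_ramp(step, total_steps)         # lambda from Section 2.3
#   drop_prob = linear_ramp(step, total_steps)   # probability of dropping a class
#   loss = seg_loss + ALPHA * lam * aux_loss     # hypothetical combination
```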