Discriminative Region Suppression for Weakly-Supervised Semantic Segmentation

Authors: Beomyoung Kim, Sangeun Han, Junmo Kim (pp. 1754-1761)

AAAI 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate the effectiveness of our approach, which achieves mIoU 71.4% on the PASCAL VOC 2012 segmentation benchmark using only image-level labels.
Researcher Affiliation Academia Beomyoung Kim, Sangeun Han, Junmo Kim Korea Advanced Institute of Science and Technology (KAIST) {qjadud1994, bichoomi, junmo.kim}@kaist.ac.kr
Pseudocode Yes Algorithm 1: Discriminative Region Suppression
Open Source Code No The paper mentions using 'DeepLab-LargeFOV code1 and DeepLab-ASPP code2 implemented based on the PyTorch framework' with GitHub links in footnotes (1: https://github.com/wangleihitcs/DeepLab-V1-PyTorch, 2: https://github.com/kazuto1011/deeplab-pytorch). However, these are external codebases used by the authors, not their own source code for the proposed DRS method.
Open Datasets Yes We demonstrate the effectiveness of the proposed approach on the PASCAL VOC 2012 segmentation benchmark dataset (Everingham et al. 2014)
Dataset Splits Yes Following the common practice in previous works, the training set is augmented to 10,582 images. We evaluate the performance of our model using the mean intersection-over-union (mIoU) metric and compare it with other state-of-the-art methods on the validation (1,449 images) and test set (1,456 images).
Hardware Specification Yes All experiments are performed on NVIDIA TITAN XP.
Software Dependencies No Our method is implemented on PyTorch (Paszke et al. 2017). The paper mentions PyTorch but does not specify a version number for it or for other software dependencies.
Experiment Setup Yes The initial learning rate is set to 1e-3 and is decreased by a factor of 10 at epochs 5 and 10. For data augmentation, we apply a random crop of size 321×321, random horizontal flipping, and random color jittering. We use a batch size of 5 and train the classification network for 15 epochs. We optimize the refinement network with MSE loss using the Adam (Kingma and Ba 2014) optimizer with a learning rate of 1e-4. The batch size is 5, the total number of training epochs is 15, and the learning rate is dropped by a factor of 10 at epochs 5 and 10. When generating pseudo segmentation labels, we empirically choose α = 0.2 for object cues and β = 0.06 for background cues.
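The α = 0.2 (object cues) and β = 0.06 (background cues) thresholds quoted above lend themselves to a short illustration. The sketch below shows one plausible way such thresholds could turn class localization maps into pseudo segmentation labels: confident object pixels keep their class, confidently empty pixels become background, and ambiguous pixels are marked with an ignore index so they do not contribute to the segmentation loss. The function name and the exact cue-combination rule are illustrative assumptions, not the authors' released code; the paper's Algorithm 1 defines the actual DRS procedure.

```python
import numpy as np

def generate_pseudo_labels(cams, alpha=0.2, beta=0.06, ignore_index=255):
    """Illustrative pseudo-label generation from class activation maps.

    cams: float array of shape (C, H, W), each class map assumed
    normalized to [0, 1]. Class ids in the output start at 1;
    0 denotes background, ignore_index marks uncertain pixels.
    """
    best = cams.max(axis=0)           # strongest class evidence per pixel
    cls = cams.argmax(axis=0) + 1     # winning class id per pixel

    labels = np.full(best.shape, ignore_index, dtype=np.int64)
    labels[best < beta] = 0                    # confident background cue
    labels[best > alpha] = cls[best > alpha]   # confident object cue
    return labels
```

Under this scheme, pixels whose peak score falls between β and α stay at the ignore index, which is the usual way weakly-supervised pipelines avoid training on unreliable cues.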