Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning

Authors: Tsung-Wei Ke, Jyh-Jing Hwang, Stella Yu

ICLR 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our experiments on Pascal VOC and DensePose demonstrate consistent gains over the state-of-the-art (SOTA), and the gain is substantial especially for the sparsest keypoint supervision." |
| Researcher Affiliation | Academia | Tsung-Wei Ke, Jyh-Jing Hwang, Stella X. Yu; UC Berkeley / ICSI; {twke,jyh,stellayu}@berkeley.edu |
| Pseudocode | Yes | Algorithm 1: Inference procedure for semantic segmentation using scribble / point / bounding box annotations. Algorithm 2: Inference procedure for semantic segmentation using image-level tags. |
| Open Source Code | Yes | "Our code is publicly available at https://github.com/twke18/SPML." |
| Open Datasets | Yes | Pascal VOC 2012 (Everingham et al., 2010) includes 20 object categories and one background class. Following Chen et al. (2017), we use the augmented training set with 10,582 images and validation set with 1,449 images. DensePose (Alp Güler et al., 2018) is a human pose parsing dataset based on MSCOCO (Lin et al., 2014). |
| Dataset Splits | Yes | "Following Chen et al. (2017), we use the augmented training set with 10,582 images and validation set with 1,449 images." |
| Hardware Specification | No | "For conducting experiments, we take advantage of XSEDE infrastructure (Towns et al., 2014) that includes Bridges resources (Nystrom et al., 2015)." |
| Software Dependencies | No | The paper mentions using DeepLab, PSPNet, ResNet-101, and ImageNet pre-training as backbone/pre-training choices but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | On the Pascal VOC dataset, we set batch size to 12 and 16 for scribble / point and image tag / bounding box annotations, respectively. On the DensePose dataset, batch size is set to 16. For all experiments, we train our models with 512×512 crop size. Following Chen et al. (2017), we adopt the poly learning rate policy, multiplying the base learning rate by (1 - iter/max_iter)^0.9. We set the initial learning rate to 0.003 and momentum to 0.9. For the hyper-parameters in the SegSort framework, we use unit-length normalized embeddings of dimension 64 and 32 on VOC and DensePose, respectively. We iterate K-Means clustering for 10 iterations and generate 36 and 144 clusters on the VOC and DensePose datasets. We set the concentration parameter κ to different values for semantic annotation, low-level image similarity, semantic co-occurrence, and feature affinity, respectively. Moreover, λI, λO, and λA are set to different values according to the type of annotation and dataset; λC is set to 1 in all experiments. The detailed hyper-parameter settings are summarized in Table 5. We train for 30k and 45k iterations on VOC and DensePose, respectively. |
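The poly learning-rate schedule quoted in the Experiment Setup row is simple enough to verify directly. Below is a minimal sketch; the `poly_lr` helper is hypothetical (not from the SPML codebase), and only the base learning rate (0.003), the decay power (0.9), and the 30k VOC iteration budget are taken from the table.

```python
def poly_lr(base_lr: float, iteration: int, max_iter: int, power: float = 0.9) -> float:
    """Poly policy: scale the base learning rate by (1 - iter/max_iter)**power."""
    return base_lr * (1.0 - iteration / max_iter) ** power

# With the reported VOC settings (base LR 0.003, 30k training iterations),
# the rate starts at the base value and decays smoothly to 0 at the last step.
print(poly_lr(0.003, 0, 30000))      # 0.003
print(poly_lr(0.003, 30000, 30000))  # 0.0
```

The schedule decays monotonically, so intermediate iterations always yield a rate strictly between the endpoints.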