Label-efficient Segmentation via Affinity Propagation
Authors: Wentong Li, Yuqian Yuan, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three typical label-efficient segmentation tasks, i.e. box-supervised instance segmentation, point/scribble-supervised semantic segmentation and CLIP-guided semantic segmentation, demonstrate the superior performance of the proposed approach. |
| Researcher Affiliation | Collaboration | Wentong Li¹, Yuqian Yuan¹, Song Wang¹, Wenyu Liu¹, Dongqi Tang², Jian Liu², Jianke Zhu¹, Lei Zhang³ — ¹Zhejiang University, ²Ant Group, ³The Hong Kong Polytechnic University |
| Pseudocode | Yes | Algorithm 1: Algorithm for GP process |
| Open Source Code | No | https://LiWentomng.github.io/apro/ - This is a project page/personal homepage, not a direct link to a source-code repository, and there is no explicit statement about code release. |
| Open Datasets | Yes | We conduct experiments on two widely used datasets for the weakly box-supervised instance segmentation task: COCO [49]... Pascal VOC [43] augmented by SBD [50] based on the original Pascal VOC 2012 [51]... We conduct experiments on the widely-used Pascal VOC2012 dataset [51]... Pascal Context [60]... COCO-Stuff [61] |
| Dataset Splits | Yes | COCO [49], which has 80 classes with 115K train2017 images and 5K val2017 images. Pascal VOC [43] augmented by SBD [50] based on the original Pascal VOC 2012 [51], which has 20 classes with 10,582 trainaug images and 1,449 val images. |
| Hardware Specification | Yes | The experiment is conducted on a single GeForce RTX 3090 with batch size 1. |
| Software Dependencies | No | We follow the commonly used training settings on each dataset as in MMDetection [56]. The paper mentions software tools and frameworks but does not provide specific version numbers for them. |
| Experiment Setup | Yes | For the instance segmentation experiments, the initial learning rate is set to 10^-4 and the weight decay is 0.05, with 16 images per mini-batch; for the Mask2Former framework [53], the large-scale jittering augmentation scheme [58] is employed with a random scale sampled within the range [0.1, 2.0], followed by a fixed-size crop to 1024x1024. For the semantic segmentation experiments, the input size is 512x512, the SGD optimizer with momentum 0.9 and weight decay 10^-4 is used, the initial learning rate is 0.001, and training runs for 80k iterations. A hedged config sketch based on these values follows the table. |
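As a concrete reference point, here is a minimal sketch of how the hyperparameters quoted in the Experiment Setup row might be written down as MMDetection/MMSegmentation-style config fragments. The dictionary layout, the transform names, and the optimizer type for the Mask2Former setting (AdamW is assumed; the paper only states the learning rate and weight decay) are assumptions for illustration, not the authors' released configuration.

```python
# Illustrative config fragments built from the hyperparameters quoted above.
# The structure mimics common MMDetection/MMSegmentation conventions and is an
# assumption for illustration, not the authors' released configuration.

# Box-supervised instance segmentation with the Mask2Former framework [53]:
# lr 1e-4, weight decay 0.05, 16 images per mini-batch, large-scale jittering [58]
# with a random scale in [0.1, 2.0] followed by a fixed 1024x1024 crop.
instance_seg_cfg = dict(
    optimizer=dict(type='AdamW', lr=1e-4, weight_decay=0.05),  # optimizer type assumed
    total_batch_size=16,
    train_pipeline=[
        dict(type='Resize', img_scale=(1024, 1024),
             ratio_range=(0.1, 2.0), keep_ratio=True),          # large-scale jittering
        dict(type='RandomCrop', crop_size=(1024, 1024)),
    ],
)

# Semantic segmentation experiments: 512x512 inputs, SGD with momentum 0.9 and
# weight decay 1e-4, initial lr 0.001, 80k training iterations.
semantic_seg_cfg = dict(
    optimizer=dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=1e-4),
    crop_size=(512, 512),
    runner=dict(type='IterBasedRunner', max_iters=80000),
)

if __name__ == '__main__':
    # Sanity check that the fragments carry the values reported in the paper.
    assert instance_seg_cfg['optimizer']['lr'] == 1e-4
    assert semantic_seg_cfg['runner']['max_iters'] == 80000
    print('instance_seg_cfg:', instance_seg_cfg)
    print('semantic_seg_cfg:', semantic_seg_cfg)
```

In an actual MMDetection/MMSegmentation run these fragments would be merged into a full model-and-dataset config; here they only gather the reported hyperparameters in one place for quick comparison.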