Iterative Few-shot Semantic Segmentation from Image Label Text
Authors: Haohan Wang, Liang Liu, Wuhao Zhang, Jiangning Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang
IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on PASCAL-5i and COCO-20i datasets demonstrate that our method not only outperforms the state-of-the-art weakly supervised approaches by a significant margin, but also achieves comparable or better results to recent supervised methods. |
| Researcher Affiliation | Collaboration | 1Shenzhen International Graduate School, Tsinghua University 2Tencent Youtu Lab |
| Pseudocode | No | No explicit pseudocode blocks or algorithm listings were found. |
| Open Source Code | No | Code will be available at https://github.com/Whileherham/IMR-HSNet. |
| Open Datasets | Yes | we evaluate our framework in two widely-used datasets, i.e., Pascal-5i [Shaban et al., 2017], COCO-20i [Lin et al., 2014] |
| Dataset Splits | Yes | We follow almost the same settings as regular few-shot segmentation, and the only difference is that the ground-truth masks of support images are replaced by the label text. We evaluate our framework on two widely-used datasets, i.e., PASCAL-5i [Shaban et al., 2017] and COCO-20i [Lin et al., 2014], and conduct cross-validation over all the folds in each dataset. For each fold i, samples in the remaining folds serve as the training data, while 1000 episodes (support-query pairs) are sampled randomly as the testing data. (An episode-sampling sketch is given below the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned. The paper only mentions using ResNet50 and VGG16 as backbones and the CLIP model's internal architecture (modified ResNet50 and transformer). |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" and pre-trained models such as "ImageNet pretrained VGG16 and ResNet50" and the "pre-trained CLIP model", but does not specify software dependencies with version numbers (e.g., PyTorch version, TensorFlow version, or specific library versions). |
| Experiment Setup | Yes | The network is trained with Adam optimizer with learning rate 2e-4, and all hyperparameters λ_t and w_s are set to one by default. However, our method is robust to the choice of these hyperparameters, as shown in the supplementary materials. Consistent with HSNet, we resize the input into 400×400 in both the training and testing stage, and we do not adopt any data augmentation strategies. |
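
The cross-validation protocol quoted in the Dataset Splits row can be made concrete with a minimal sketch. The 4-fold PASCAL-5i class split and all helper names below (`split_classes`, `sample_episodes`, `image_index`) are illustrative assumptions, not code from the authors' repository.

```python
import random

# Illustrative PASCAL-5i protocol: 20 classes split into 4 folds of 5.
# Fold i is held out for testing; the remaining 15 classes are used for training.
NUM_CLASSES, NUM_FOLDS = 20, 4
ALL_CLASSES = list(range(NUM_CLASSES))

def split_classes(test_fold):
    """Return (train_classes, test_classes) for a given fold index."""
    fold_size = NUM_CLASSES // NUM_FOLDS
    test_classes = ALL_CLASSES[test_fold * fold_size:(test_fold + 1) * fold_size]
    train_classes = [c for c in ALL_CLASSES if c not in test_classes]
    return train_classes, test_classes

def sample_episodes(image_index, test_classes, num_episodes=1000, seed=0):
    """Randomly sample support/query pairs (episodes) from the held-out classes.

    `image_index` is assumed to map a class id to a list of image ids containing
    that class. In the weakly supervised setting of the paper, the support
    annotation is only the class label text, not a ground-truth mask.
    """
    rng = random.Random(seed)
    episodes = []
    for _ in range(num_episodes):
        cls = rng.choice(test_classes)
        support_img, query_img = rng.sample(image_index[cls], 2)
        episodes.append({"class": cls, "support": support_img, "query": query_img})
    return episodes
```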
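
Similarly, the Experiment Setup row can be read as the following configuration sketch (PyTorch assumed). The stand-in model and variable names are hypothetical; only the optimizer choice, learning rate, default hyperparameter values, and the resize-only preprocessing come from the paper.

```python
import torch
from torch import optim
import torchvision.transforms as T

# Hypothetical stand-in for the IMR-HSNet model (real code is referenced at
# https://github.com/Whileherham/IMR-HSNet); any module with parameters works here.
model = torch.nn.Conv2d(3, 1, kernel_size=1)

# Settings quoted from the paper: Adam optimizer with learning rate 2e-4,
# and hyperparameters lambda_t = w_s = 1 by default.
optimizer = optim.Adam(model.parameters(), lr=2e-4)
lambda_t, w_s = 1.0, 1.0  # weighting hyperparameters, set to one by default

# Resize-only preprocessing, used in both training and testing;
# the paper reports no data augmentation strategies.
preprocess = T.Compose([
    T.Resize((400, 400)),
    T.ToTensor(),
])
```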