Iterative Few-shot Semantic Segmentation from Image Label Text

Authors: Haohan Wang, Liang Liu, Wuhao Zhang, Jiangning Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on PASCAL-5i and COCO-20i datasets demonstrate that our method not only outperforms the state-of-the-art weakly supervised approaches by a significant margin, but also achieves comparable or better results to recent supervised methods.
Researcher Affiliation | Collaboration | (1) Shenzhen International Graduate School, Tsinghua University; (2) Tencent Youtu Lab
Pseudocode | No | No explicit pseudocode blocks or algorithm listings were found.
Open Source Code | No | Code will be available at https://github.com/Whileherham/IMR-HSNet.
Open Datasets | Yes | We evaluate our framework in two widely-used datasets, i.e., Pascal-5i [Shaban et al., 2017], COCO-20i [Lin et al., 2014].
Dataset Splits | Yes | We follow almost the same settings as regular few-shot segmentation, and the only difference is that the ground-truth masks of support images are replaced by the label text. We evaluate our framework in two widely-used datasets, i.e., Pascal-5i [Shaban et al., 2017], COCO-20i [Lin et al., 2014], and conduct the cross-validation over all the folds in each dataset. For each fold i, samples in the remaining folds serve as the training data, while 1000 episodes (support-query pairs) are sampled randomly as the testing data. (An episode-sampling sketch follows the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned. The paper only mentions using ResNet50 and VGG16 as backbones and the CLIP model's internal architecture (a modified ResNet50 and a transformer).
Software Dependencies | No | The paper mentions using the "Adam optimizer" and pre-trained models such as "ImageNet pretrained VGG16 and ResNet50" and the "pre-trained CLIP model", but does not specify software dependencies with version numbers (e.g., PyTorch or TensorFlow versions, or specific library versions).
Experiment Setup | Yes | The network is trained with the Adam optimizer with learning rate 2e-4, and all hyperparameters λt and ws are set to one by default. However, our method is robust to the choice of these hyperparameters, as shown in the supplementary materials. Consistent with HSNet, we resize the input into 400×400 in both the training and testing stages, and we do not adopt any data augmentation strategies. (A hedged training-configuration sketch follows the table.)
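
For the Dataset Splits row, the quoted cross-validation protocol can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' code: the fold layout follows the standard PASCAL-5i convention (20 classes split into 4 folds of 5), and list_images_with_class is a hypothetical helper supplied by the caller.

```python
import random

# Minimal sketch of the PASCAL-5i episodic evaluation protocol quoted above.
# `list_images_with_class(cls)` is a hypothetical helper returning image ids
# containing class `cls`; it is not part of the authors' released code.

NUM_CLASSES = 20          # PASCAL VOC foreground classes
NUM_FOLDS = 4             # PASCAL-5i splits them into 4 folds of 5 classes
EPISODES_PER_FOLD = 1000  # 1000 support-query pairs sampled for testing

def fold_classes(fold):
    """Classes held out for testing in a given fold; the rest are used for training."""
    return list(range(fold * 5, (fold + 1) * 5))

def sample_test_episodes(fold, list_images_with_class, seed=0):
    """Randomly sample 1-shot support-query episodes for one fold."""
    rng = random.Random(seed)
    test_classes = fold_classes(fold)
    episodes = []
    for _ in range(EPISODES_PER_FOLD):
        cls = rng.choice(test_classes)
        support, query = rng.sample(list_images_with_class(cls), 2)
        # In the weakly supervised setting, the support annotation is only the
        # class label text (e.g., "dog"), not a ground-truth mask.
        episodes.append({"class": cls, "support": support, "query": query})
    return episodes
```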
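
For the Experiment Setup row, the reported training configuration (Adam, learning rate 2e-4, 400×400 inputs, no augmentation, λt = ws = 1) could be expressed roughly as follows. This is a hedged sketch that assumes a PyTorch implementation (the paper does not name the framework), and the placeholder network stands in for the actual IMR-HSNet model.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Placeholder network standing in for the IMR-HSNet model defined in the
# authors' repository; only the surrounding configuration mirrors the paper.
model = nn.Sequential(nn.Conv2d(3, 1, kernel_size=1))

# All hyperparameters lambda_t and w_s are reported as 1 by default.
lambda_t, w_s = 1.0, 1.0

# Adam optimizer with learning rate 2e-4, as stated in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)

# Inputs are resized to 400x400 in both training and testing;
# no data augmentation strategies are applied.
preprocess = transforms.Compose([
    transforms.Resize((400, 400)),
    transforms.ToTensor(),
])
```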