Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings
Authors: Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate results of experiments conducted on the PASCAL-5i dataset [Shaban et al., 2017] compared to state of the art methods in section 5.2. We then demonstrate the results for the different variants of our approach depicted in Fig. 3 and experiment with the proposed TOSFL setup in section 5.3. |
| Researcher Affiliation | Collaboration | 1 University of Alberta, 2 Indian Institute of Science, 3 Element AI, 4 HiSilicon (Huawei Research) |
| Pseudocode | No | The paper includes figures describing the model architecture, but it does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We utilize a ResNet-50 [He et al., 2016] encoder pre-trained on ImageNet [Deng et al., 2009] to extract visual features... PASCAL-5i splits PASCAL-VOC 20 classes into 4 folds... Table 2 demonstrates results on MS-COCO [Lin et al., 2014]... Our setup relies on the image-level label for the support image to segment different parts from the query image conditioned on the word embeddings of this image-level label. We utilize Youtube-VOS dataset training data which has 65 classes, and we split them into 5 folds. |
| Dataset Splits | Yes | In order to ensure the evaluation for the few-shot method is not biased to a certain category, it is best to split into multiple folds and evaluate on different ones similar to [Shaban et al., 2017]... PASCAL-5i splits PASCAL-VOC 20 classes into 4 folds each having 5 classes... In each fold the model is meta-trained for a maximum of 50 epochs on the classes outside the test fold on PASCAL-5i, and 20 epochs on both MS-COCO and Youtube-VOS. (A minimal sketch of this fold protocol is given below the table.) |
| Hardware Specification | No | The paper describes the training process and parameters (e.g., 'momentum SGD', 'Batch size of 4'), but it does not specify any hardware components like GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions models like 'Res Net-50' and optimizers like 'momentum SGD', but it does not list specific software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA 11.x). |
| Experiment Setup | Yes | We train all models using momentum SGD with learning rate 0.01 that is reduced by 0.1 at epochs 35, 40 and 45, and momentum 0.9. L2 regularization with a factor of 5×10⁻⁴ is used to avoid over-fitting. A batch size of 4 and an input resolution of 321×321 are used during training, with random horizontal flipping and random centered cropping for the support set. An input resolution of 500×500 is used for the meta-testing phase, similar to [Shaban et al., 2017]. (A hedged sketch of this optimizer and schedule is also given below the table.) |
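The dataset-split protocol quoted above (PASCAL-5i: 20 PASCAL-VOC classes split into 4 folds of 5, meta-training on the classes outside the test fold) can be captured in a short sketch. This is a minimal Python illustration, not code from the paper; the consecutive grouping of class ids and the function name `pascal5i_split` are assumptions, and the official split files of Shaban et al. (2017) remain authoritative.

```python
# Minimal sketch of the PASCAL-5i fold protocol described in the paper:
# the 20 PASCAL-VOC classes are split into 4 folds of 5 classes each,
# the model is meta-trained on the 15 classes outside the test fold and
# meta-tested on the 5 held-out classes. Grouping class ids 1..20
# consecutively is an assumption made for illustration only.

def pascal5i_split(fold, num_classes=20, num_folds=4):
    """Return (train_classes, test_classes) for the given fold index."""
    assert 0 <= fold < num_folds
    per_fold = num_classes // num_folds  # 5 classes per fold
    test_classes = list(range(fold * per_fold + 1, (fold + 1) * per_fold + 1))
    train_classes = [c for c in range(1, num_classes + 1) if c not in test_classes]
    return train_classes, test_classes


if __name__ == "__main__":
    for fold in range(4):
        train_cls, test_cls = pascal5i_split(fold)
        print(f"fold {fold}: meta-train on {train_cls}, meta-test on {test_cls}")
```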
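The experiment-setup row translates almost directly into an optimizer and schedule definition. Below is a hedged PyTorch-style sketch: the paper does not name its framework, model class, or data pipeline, so the use of `torch`/`torchvision`, the placeholder model, and the transform choices (e.g. `RandomCrop` standing in for "random centered cropping") are assumptions; only the hyperparameter values come from the paper.

```python
# Sketch of the reported training hyperparameters, assuming PyTorch.
import torch
from torch import nn, optim
from torchvision import transforms

# Placeholder module standing in for the actual few-shot segmentation network.
model = nn.Conv2d(3, 2, kernel_size=1)

optimizer = optim.SGD(
    model.parameters(),
    lr=0.01,            # initial learning rate
    momentum=0.9,       # momentum SGD
    weight_decay=5e-4,  # L2 regularization factor of 5x10^-4
)
# Learning rate reduced by a factor of 0.1 at epochs 35, 40 and 45.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[35, 40, 45], gamma=0.1)

# Support-set augmentation at the 321x321 training resolution:
# random horizontal flip plus a crop (approximating "random centered cropping").
support_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(321, pad_if_needed=True),
    transforms.ToTensor(),
])

BATCH_SIZE = 4           # training batch size reported in the paper
TEST_RESOLUTION = 500    # meta-testing uses 500x500 inputs, as in Shaban et al. (2017)

for epoch in range(50):  # up to 50 meta-training epochs on PASCAL-5i
    # ... one pass over the episodic training data would go here ...
    scheduler.step()
```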