Mask Matching Transformer for Few-Shot Segmentation
Authors: Siyu Jiao, Gengwei Zhang, Shant Navasardyan, Ling Chen, Yao Zhao, Yunchao Wei, Humphrey Shi
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on the popular COCO-20i and Pascal-5i benchmarks. Competitive results well demonstrate the effectiveness and the generalization ability of our MM-Former. |
| Researcher Affiliation | Collaboration | Siyu Jiao (1,2), Gengwei Zhang (3), Shant Navasardyan (4), Ling Chen (3), Yao Zhao (1,2), Yunchao Wei (1,2), Humphrey Shi (4). Affiliations: (1) Institute of Information Science, Beijing Jiaotong University; (2) Beijing Key Laboratory of Advanced Information Science and Network; (3) AAII, University of Technology Sydney; (4) Picsart AI Research (PAIR) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at github.com/Picsart-AI-Research/Mask-Matching-Transformer. |
| Open Datasets | Yes | We conduct experiments on two popular few-shot segmentation benchmarks, Pascal-5i [9] and COCO-20i [14], to evaluate our method. Pascal-5i, with extra mask annotations from SBD [10], consists of 20 classes separated into 4 splits. COCO-20i consists of annotated images from 80 classes. |
| Dataset Splits | No | The paper describes cross-validation training and testing splits for Pascal-5i (15 classes for training, 5 for testing per fold) and COCO-20i (60 classes for training, 20 for testing per fold), but does not explicitly mention a separate validation dataset split. (See the fold-split sketch below the table.) |
| Hardware Specification | Yes | All the experiments are conducted on a single RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions software components like the 'AdamW optimizer' and 'ResNet-50 backbone' and refers to frameworks like 'Mask2Former', but does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, or other libraries with their versions). |
| Experiment Setup | Yes | For the first stage, we freeze the ImageNet [6] pre-trained backbone. The POS is trained on Pascal-5i for 20,000 iterations and 60,000 iterations on COCO-20i, respectively. Learning rate is set to 1e-4, batch size is set to 8. For the second stage, we freeze the parameters of the backbone and the POS, and only train the MM module for 10,000/20,000 iterations on Pascal-5i / COCO-20i, respectively. Learning rate is set to 1e-4, batch size is set to 4. For both stages, we use the AdamW [17] optimizer with a weight decay of 5e-2. The learning rate is decreased using the poly schedule with a factor of 0.9. All images are resized and cropped into 480×480 for training. We also employ random horizontal flipping and random crop techniques for data augmentation. (See the training-setup sketch below the table.) |
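
The fold construction referenced in the Dataset Splits row follows the standard Pascal-5i / COCO-20i cross-validation protocol: the classes are divided into 4 folds, and each fold's classes are held out for testing while the remainder are used for training. The sketch below illustrates that convention only; it is not taken from the authors' repository, and `fold_classes` is a hypothetical name.

```python
# Illustrative sketch (assumed, not from the paper's code): the standard
# Pascal-5i / COCO-20i fold construction used in few-shot segmentation.
# Fold i holds out one contiguous block of classes for testing; the
# remaining classes are used for training.

def fold_classes(fold: int, num_classes: int, num_folds: int = 4):
    """Return (train_classes, test_classes) for one cross-validation fold."""
    per_fold = num_classes // num_folds        # 5 for Pascal-5i, 20 for COCO-20i
    test = set(range(fold * per_fold, (fold + 1) * per_fold))
    train = set(range(num_classes)) - test
    return sorted(train), sorted(test)

# Pascal-5i, fold 0: train on classes 5..19, test on classes 0..4.
train_cls, test_cls = fold_classes(fold=0, num_classes=20)
```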
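The Experiment Setup row fully specifies the optimizer and learning-rate schedule, so a minimal PyTorch sketch can make it concrete. This assumes the quoted hyperparameters (AdamW, learning rate 1e-4, weight decay 5e-2, poly decay with power 0.9, stepped per iteration); `model` and `max_iters` are placeholders, not the authors' implementation.

```python
# Minimal sketch of the quoted optimization setup, assuming per-iteration
# poly decay. `model` is a stand-in module, not the MM-Former itself.

import torch

model = torch.nn.Conv2d(3, 1, 1)               # placeholder for the MM module
max_iters = 10_000                             # 10k (Pascal-5i) / 20k (COCO-20i)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=5e-2)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda it: (1 - it / max_iters) ** 0.9,  # poly schedule, factor 0.9
)

for it in range(max_iters):
    # ... forward/backward on a 480x480 crop with random horizontal flip ...
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```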