SOIT: Segmenting Objects with Instance-Aware Transformers
Authors: Xiaodong Yu, Dahu Shi, Xing Wei, Ye Ren, Tingqun Ye, Wenming Tan
AAAI 2022, pp. 3188-3196 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the MS COCO dataset demonstrate that SOIT outperforms state-of-the-art instance segmentation approaches significantly. Moreover, the joint learning of multiple tasks in a unified query embedding can also substantially improve the detection performance. Code is available at https://github.com/yuxiaodongHRI/SOIT. |
| Researcher Affiliation | Collaboration | (1) Hikvision Research Institute, Hangzhou, China; (2) School of Software Engineering, Xi'an Jiaotong University |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | Yes | Code is available at https://github.com/yuxiaodongHRI/SOIT. |
| Open Datasets | Yes | We validate our method on COCO benchmark (Lin et al. 2014). COCO 2017 dataset contains 115k images for training (split train2017), 5k for validation (split val2017) and 20k for testing (split test-dev), involving 80 object categories with instance-level segmentation annotations. |
| Dataset Splits | Yes | COCO 2017 dataset contains 115k images for training (split train2017), 5k for validation (split val2017) and 20k for testing (split test-dev), involving 80 object categories with instance-level segmentation annotations. Following the common practice, our models are trained with split train2017, and all the ablation experiments are evaluated on split val2017. |
| Hardware Specification | Yes | All experiments are conducted on 16 NVIDIA Tesla V100 GPUs with a total batch size of 32. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' but does not specify version numbers for any software dependencies like Python, PyTorch, CUDA, or other libraries. |
| Experiment Setup | Yes | We train our model with the Adam optimizer (Kingma and Ba 2015) with a base learning rate of 2.0 × 10⁻⁴, momentum of 0.9, and weight decay of 1.0 × 10⁻⁴. Models are trained for 50 epochs, and the initial learning rate is decayed at the 40th epoch by a factor of 0.1. Multi-scale training is adopted, where the shorter side is randomly chosen within [480, 800] and the longer side is less than or equal to 1333. When testing, the input image is resized to have the shorter side be 800 and the longer side less than or equal to 1333. All experiments are conducted on 16 NVIDIA Tesla V100 GPUs with a total batch size of 32. A hedged configuration sketch based on these settings appears below the table. |
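
For concreteness, here is a minimal PyTorch-style sketch of the training configuration quoted in the Experiment Setup row. It is not the authors' code (the released SOIT repository is organized differently); the `model` placeholder, the helper name `multi_scale_resize`, and the reading of "momentum of 0.9" as Adam's beta1 are assumptions made for illustration.

```python
import random

import torch
import torchvision.transforms.functional as TF


def multi_scale_resize(image, short_min=480, short_max=800, long_max=1333):
    """Resize a CHW image tensor so the shorter side is a random value in
    [480, 800] while the longer side stays <= 1333 (aspect ratio preserved).
    At test time the paper instead fixes the shorter side to 800 with the
    same 1333 cap on the longer side."""
    target_short = random.randint(short_min, short_max)
    h, w = image.shape[-2:]
    scale = target_short / min(h, w)
    if max(h, w) * scale > long_max:
        scale = long_max / max(h, w)
    return TF.resize(image, [int(round(h * scale)), int(round(w * scale))])


# Placeholder module standing in for the SOIT network; the real model comes
# from the authors' repository (https://github.com/yuxiaodongHRI/SOIT).
model = torch.nn.Conv2d(3, 64, kernel_size=3)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=2.0e-4,            # base learning rate from the paper
    betas=(0.9, 0.999),   # the paper's "momentum of 0.9" read as Adam's beta1
    weight_decay=1.0e-4,
)

# 50 epochs total, learning rate decayed by a factor of 0.1 at the 40th epoch.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[40], gamma=0.1)

EPOCHS = 50
TOTAL_BATCH_SIZE = 32  # spread over 16 GPUs, i.e. 2 images per GPU

for epoch in range(EPOCHS):
    # ... iterate over COCO train2017 batches, apply multi_scale_resize,
    # compute the detection/segmentation losses, and call optimizer.step() ...
    scheduler.step()
```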