SOLOv2: Dynamic and Fast Instance Segmentation
Authors: Xinlong Wang, Rufeng Zhang, Tao Kong, Lei Li, Chunhua Shen
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that the proposed SOLOv2 achieves the state-of-the-art performance with high efficiency, making it suitable for both mobile and cloud applications. A light-weight version of SOLOv2 executes at 31.3 FPS and yields 37.1% AP on COCO test-dev. |
| Researcher Affiliation | Collaboration | Xinlong Wang¹, Rufeng Zhang², Tao Kong³, Lei Li³, Chunhua Shen¹. ¹The University of Adelaide, Australia; ²Tongji University, China; ³ByteDance AI Lab |
| Pseudocode | Yes | The pseudo-code of Matrix NMS is provided in supplementary material. |
| Open Source Code | Yes | Code is available at https://git.io/AdelaiDet |
| Open Datasets | Yes | We conduct experiments on three basic tasks, instance segmentation, object detection, and panoptic segmentation on MS COCO [22]. We also present experimental results on the recently proposed LVIS dataset [11] |
| Dataset Splits | Yes | For instance segmentation, we report lesion and sensitivity studies by evaluating on the COCO 5K val2017 split. We also report COCO mask AP on the test-dev split, which is evaluated on the evaluation server. SOLOv2 is trained with stochastic gradient descent (SGD). We use synchronized SGD over 8 GPUs with a total of 16 images per mini-batch. Unless otherwise specified, all models are trained for 36 epochs (i.e., 3×) with an initial learning rate of 0.01, which is then divided by 10 at 27th and again at 33th epoch. We use scale jitter where the shorter image side is randomly sampled from 640 to 800 pixels. |
| Hardware Specification | Yes | All methods are evaluated using one Tesla V100 GPU. The running time is tested on our local machine, with a single V100 GPU. The Res-50-FPN SOLOv2 achieves 38.8% mask AP at 18 FPS on the challenging MS COCO dataset, evaluated on a single V100 GPU card. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or framework versions like PyTorch 1.9, CUDA 11.1) are explicitly mentioned in the paper. |
| Experiment Setup | Yes | SOLOv2 is trained with stochastic gradient descent (SGD). We use synchronized SGD over 8 GPUs with a total of 16 images per mini-batch. Unless otherwise specified, all models are trained for 36 epochs (i.e., 3×) with an initial learning rate of 0.01, which is then divided by 10 at 27th and again at 33th epoch. We use scale jitter where the shorter image side is randomly sampled from 640 to 800 pixels. |
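
The training schedule quoted in the Experiment Setup row can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' training code: the names `learning_rate` and `sample_shorter_side` are hypothetical, epochs are assumed to be 0-indexed with the decay applied once the epoch counter reaches 27 and 33, and the scale jitter is assumed to draw a uniform integer from the inclusive range 640 to 800 (the excerpt does not say whether sampling is continuous or from a discrete set).

```python
import random

# Values taken from the quoted setup.
BASE_LR = 0.01            # initial learning rate
DECAY_EPOCHS = (27, 33)   # rate is divided by 10 at each of these epochs
TOTAL_EPOCHS = 36         # the "3x" schedule
NUM_GPUS = 8
IMAGES_PER_BATCH = 16     # total across all GPUs, i.e. 2 images per GPU


def learning_rate(epoch: int) -> float:
    """Step-decayed learning rate for a 0-indexed epoch (indexing is an assumption)."""
    drops = sum(1 for boundary in DECAY_EPOCHS if epoch >= boundary)
    return BASE_LR / (10 ** drops)


def sample_shorter_side(rng: random.Random, low: int = 640, high: int = 800) -> int:
    """Scale jitter: pick the shorter image side uniformly from [low, high] (assumed)."""
    return rng.randint(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    for epoch in range(TOTAL_EPOCHS):
        lr = learning_rate(epoch)
        side = sample_shorter_side(rng)
        print(f"epoch {epoch:2d}  lr {lr:.4f}  example shorter side {side}")
```

Under these assumptions the rate stays at 0.01 for epochs 0 to 26, drops to 0.001 for epochs 27 to 32, and to 0.0001 for the last three epochs. A 1-indexed reading of "27th" and "33th" would shift each boundary by one epoch.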