SOLOv2: Dynamic and Fast Instance Segmentation

Authors: Xinlong Wang, Rufeng Zhang, Tao Kong, Lei Li, Chunhua Shen

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that the proposed SOLOv2 achieves the state-of-the-art performance with high efficiency, making it suitable for both mobile and cloud applications. A light-weight version of SOLOv2 executes at 31.3 FPS and yields 37.1% AP on COCO test-dev."
Researcher Affiliation | Collaboration | Xinlong Wang¹, Rufeng Zhang², Tao Kong³, Lei Li³, Chunhua Shen¹ (¹The University of Adelaide, Australia; ²Tongji University, China; ³ByteDance AI Lab)
Pseudocode | Yes | "The pseudo-code of Matrix NMS is provided in supplementary material."
Open Source Code | Yes | "Code is available at https://git.io/AdelaiDet"
Open Datasets | Yes | "We conduct experiments on three basic tasks: instance segmentation, object detection, and panoptic segmentation on MS COCO [22]. We also present experimental results on the recently proposed LVIS dataset [11]."
Dataset Splits | Yes | "For instance segmentation, we report lesion and sensitivity studies by evaluating on the COCO 5K val2017 split. We also report COCO mask AP on the test-dev split, which is evaluated on the evaluation server."
Hardware Specification | Yes | "All methods are evaluated using one Tesla V100 GPU. The running time is tested on our local machine, with a single V100 GPU. The Res-50-FPN SOLOv2 achieves 38.8% mask AP at 18 FPS on the challenging MS COCO dataset, evaluated on a single V100 GPU card."
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or framework versions such as PyTorch 1.9 or CUDA 11.1) are explicitly mentioned in the paper.
Experiment Setup | Yes | "SOLOv2 is trained with stochastic gradient descent (SGD). We use synchronized SGD over 8 GPUs with a total of 16 images per mini-batch. Unless otherwise specified, all models are trained for 36 epochs (i.e., a 3× schedule) with an initial learning rate of 0.01, which is then divided by 10 at the 27th and again at the 33rd epoch. We use scale jitter where the shorter image side is randomly sampled from 640 to 800 pixels."
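The Matrix NMS mentioned in the Pseudocode row (its pseudo-code is in the paper's supplementary material) decays every prediction's score by its overlaps with higher-scored predictions, entirely with matrix operations and no sequential suppression loop. The following is an illustrative NumPy sketch, not the authors' released code; the function name and the `sigma`/kernel defaults are assumptions based on the paper's description:

```python
import numpy as np

def matrix_nms(scores, iou, kernel="gaussian", sigma=2.0):
    """Illustrative Matrix NMS sketch (not the official implementation).

    scores: (N,) confidences sorted in descending order.
    iou:    (N, N) symmetric pairwise mask-IoU matrix.
    Returns the decayed scores.
    """
    n = scores.shape[0]
    iou = np.triu(iou, k=1)                 # keep iou[i, j] only where i outscores j
    iou_cmax = iou.max(axis=0)              # each prediction's largest IoU with a higher-scored one
    iou_cmax = np.tile(iou_cmax[:, None], (1, n))  # row i repeats iou_cmax[i] across all columns
    if kernel == "gaussian":
        # decay grows with iou[i, j], compensated by how suppressed i itself is
        decay = np.exp(-sigma * (iou ** 2 - iou_cmax ** 2))
    else:  # linear kernel
        decay = (1.0 - iou) / np.clip(1.0 - iou_cmax, 1e-6, None)
    decay = decay.min(axis=0)               # the most suppressive predecessor wins
    return scores * decay
```

After decaying, a plain score threshold selects the final instances, which is what lets the whole step run in parallel on the GPU.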
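The learning-rate schedule quoted in the Experiment Setup row (base rate 0.01, divided by 10 at the 27th and again at the 33rd epoch) can be sketched as a step function; the function name and defaults below are illustrative, as the paper only specifies the base rate, the divisor, and the milestone epochs:

```python
def lr_at_epoch(epoch, base_lr=0.01, milestones=(27, 33), gamma=0.1):
    """Step schedule: multiply the base rate by `gamma` (i.e. divide by 10)
    once for each milestone epoch that training has reached."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)
```

With 36 total epochs this gives 0.01 for epochs 0–26, 0.001 for 27–32, and 0.0001 for 33–35.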