ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation
Authors: Shuyang Sun, Weijun Wang, Andrew Howard, Qihang Yu, Philip Torr, Liang-Chieh Chen
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our study of ReMaX involves analyzing its performance on three commonly used image segmentation datasets. COCO [44] supports semantic, instance, and panoptic segmentation with 80 things and 53 stuff categories; Cityscapes [17] consists of 8 things and 11 stuff categories; and ADE20K [77] contains 100 things and 50 stuff categories. We evaluate our method using the Panoptic Quality (PQ) metric defined in [36] (for panoptic segmentation), the Average Precision (AP) defined in [44] (for instance segmentation), and the mIoU [19] metric (for semantic segmentation). A simplified PQ sketch follows the table. |
| Researcher Affiliation | Collaboration | Shuyang Sun^1, Weijun Wang^2, Qihang Yu, Andrew Howard^2, Philip Torr^1, Liang-Chieh Chen. ^1University of Oxford, ^2Google Research |
| Pseudocode | No | The paper includes mathematical formulations and block diagrams (e.g., Figure 3), but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' sections. |
| Open Source Code | Yes | Code and pre-trained checkpoints will be available at https://github.com/google-research/deeplab2. |
| Open Datasets | Yes | Our study of ReMaX involves analyzing its performance on three commonly used image segmentation datasets. COCO [44] supports semantic, instance, and panoptic segmentation with 80 things and 53 stuff categories; Cityscapes [17] consists of 8 things and 11 stuff categories; and ADE20K [77] contains 100 things and 50 stuff categories. |
| Dataset Splits | Yes | ReMaX significantly improves training convergence and outperforms the baseline by a large margin. As shown in Figure 5, when training the model under different training schedules (50K, 100K, and 150K iterations), our method outperforms the baselines by a clear margin for all schedules. |
| Hardware Specification | Yes | Our models are trained using a batch size of 32 on 32 TPUv3 cores, with a total of 60K iterations. |
| Software Dependencies | No | The entire framework is implemented with DeepLab2 [68] in TensorFlow [1]. |
| Experiment Setup | Yes | The learning rate for the ImageNet-pretrained [56] backbone is multiplied by a smaller learning rate factor of 0.1. For training augmentations, we adopt multi-scale training by randomly scaling the input images with a scaling ratio from 0.3 to 1.7 and then cropping them to a resolution of 1281 × 1281. ... The AdamW [34, 49] optimizer is used with weight decay 0.005 for the short schedules (50K and 100K) with a batch size of 64. For the long schedule, we set the weight decay to 0.02. The initial learning rate is set to 0.006, which is multiplied by a decay factor of 0.1 when training reaches 85% and 95% of the total iterations. ... The η for the ReClass operation is set to 0.1. A training-setup sketch follows the table. |
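
For the Panoptic Quality metric cited in the Research Type row, below is a minimal NumPy sketch of the PQ definition from [36]: a prediction matches a ground-truth segment of the same category when their IoU exceeds 0.5, and PQ = (sum of matched IoUs) / (TP + 0.5·FP + 0.5·FN). The function and variable names (`panoptic_quality`, `pred_segments`, `gt_segments`) are illustrative and not taken from the paper or DeepLab2, and the sketch omits the per-class averaging and void-region handling of the full metric.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union > 0 else 0.0

def panoptic_quality(pred_segments, gt_segments, match_threshold=0.5):
    """Simplified PQ over one image.

    pred_segments, gt_segments: lists of (category_id, boolean_mask) tuples.
    Returns (sum of matched IoUs) / (TP + 0.5*FP + 0.5*FN).
    """
    matched_ious, matched_gt, matched_pred = [], set(), set()
    for gi, (g_cat, g_mask) in enumerate(gt_segments):
        for pi, (p_cat, p_mask) in enumerate(pred_segments):
            if p_cat != g_cat or pi in matched_pred:
                continue
            overlap = iou(g_mask, p_mask)
            if overlap > match_threshold:  # IoU > 0.5 defines a true positive
                matched_ious.append(overlap)
                matched_gt.add(gi)
                matched_pred.add(pi)
                break
    tp = len(matched_ious)
    fp = len(pred_segments) - len(matched_pred)
    fn = len(gt_segments) - len(matched_gt)
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denom if denom > 0 else 0.0
```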
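
The hyperparameters in the Experiment Setup row can also be summarized in a short TensorFlow sketch. This is a simplified illustration under the stated settings, not DeepLab2's actual training pipeline: the real implementation crops the image and panoptic labels jointly with a random crop, and `tf.keras.optimizers.AdamW` is only available in recent TensorFlow releases.

```python
import tensorflow as tf

TOTAL_ITERS = 150_000       # long schedule; the paper also uses 50K and 100K schedules
BASE_LR = 0.006
BACKBONE_LR_FACTOR = 0.1    # ImageNet-pretrained backbone uses a 10x smaller learning rate

def step_decay_lr(step, total_iters=TOTAL_ITERS, base_lr=BASE_LR):
    """Base learning rate multiplied by 0.1 at 85% and again at 95% of training."""
    lr = base_lr
    if step >= 0.85 * total_iters:
        lr *= 0.1
    if step >= 0.95 * total_iters:
        lr *= 0.1
    return lr

def multi_scale_crop(image, crop_size=1281, min_scale=0.3, max_scale=1.7):
    """Randomly rescale by a factor in [0.3, 1.7], then pad/crop to 1281 x 1281.

    Simplified: center pad/crop on the image only; the actual pipeline also
    transforms the panoptic label maps and crops at a random location.
    """
    scale = tf.random.uniform([], min_scale, max_scale)
    new_h = tf.cast(tf.cast(tf.shape(image)[0], tf.float32) * scale, tf.int32)
    new_w = tf.cast(tf.cast(tf.shape(image)[1], tf.float32) * scale, tf.int32)
    image = tf.image.resize(image, [new_h, new_w])
    return tf.image.resize_with_crop_or_pad(image, crop_size, crop_size)

# AdamW with weight decay 0.005 (short schedules) or 0.02 (long schedule).
optimizer = tf.keras.optimizers.AdamW(learning_rate=BASE_LR, weight_decay=0.005)
```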