ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

Authors: Shuyang Sun, Weijun Wang, Andrew Howard, Qihang Yu, Philip Torr, Liang-Chieh Chen

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our study of ReMaX involves analyzing its performance on three commonly used image segmentation datasets. COCO [44] supports semantic, instance, and panoptic segmentation with 80 things and 53 stuff categories; Cityscapes [17] consists of 8 things and 11 stuff categories; and ADE20K [77] contains 100 things and 50 stuff categories. We evaluate our method using the Panoptic Quality (PQ) metric defined in [36] (for panoptic segmentation), the Average Precision defined in [44] (for instance segmentation), and the mIoU [19] metric (for semantic segmentation). (The PQ definition is recalled below the table.)
Researcher Affiliation | Collaboration | Shuyang Sun¹, Weijun Wang², Qihang Yu, Andrew Howard², Philip Torr¹, Liang-Chieh Chen; ¹University of Oxford, ²Google Research
Pseudocode | No | The paper includes mathematical formulations and block diagrams (e.g., Figure 3), but it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' sections.
Open Source Code | Yes | Code and pre-trained checkpoints will be available at https://github.com/google-research/deeplab2.
Open Datasets | Yes | Our study of ReMaX involves analyzing its performance on three commonly used image segmentation datasets. COCO [44] supports semantic, instance, and panoptic segmentation with 80 things and 53 stuff categories; Cityscapes [17] consists of 8 things and 11 stuff categories; and ADE20K [77] contains 100 things and 50 stuff categories.
Dataset Splits | Yes | ReMaX significantly improves training convergence and outperforms the baseline by a large margin. As shown in Figure 5, when training the model under different training schedules (50K, 100K, and 150K), our method outperforms the baselines by a clear margin for all schedules.
Hardware Specification | Yes | Our models are trained using a batch size of 32 on 32 TPUv3 cores, with a total of 60K iterations.
Software Dependencies | No | The entire framework is implemented with DeepLab2 [68] in TensorFlow [1].
Experiment Setup | Yes | The learning rate for the ImageNet-pretrained [56] backbone is multiplied by a smaller learning rate factor of 0.1. For training augmentations, we adopt multi-scale training by randomly scaling the input images with a scaling ratio from 0.3 to 1.7 and then cropping them to resolution 1281 × 1281. ... The AdamW [34, 49] optimizer is used with weight decay 0.005 for the short schedules (50K and 100K) with a batch size of 64. For the long schedule, we set the weight decay to 0.02. The initial learning rate is set to 0.006, which is multiplied by a decay factor of 0.1 when training reaches 85% and 95% of the total iterations. ... The η for the ReClass operation is set to 0.1. (These settings are consolidated in the sketch below the table.)
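
For reference, the Panoptic Quality (PQ) metric quoted in the Research Type row follows the standard definition of Kirillov et al. [36]: predicted and ground-truth segments (p, g) of the same class are matched as true positives when IoU(p, g) > 0.5, and PQ factors into segmentation quality (SQ) and recognition quality (RQ). This is the general metric definition, not anything specific to ReMaX:

```latex
% Panoptic Quality as defined by Kirillov et al. [36]
\[
\mathrm{PQ}
  = \frac{\sum_{(p,g) \in \mathit{TP}} \mathrm{IoU}(p,g)}
         {|\mathit{TP}| + \tfrac{1}{2}|\mathit{FP}| + \tfrac{1}{2}|\mathit{FN}|}
  = \underbrace{\frac{\sum_{(p,g) \in \mathit{TP}} \mathrm{IoU}(p,g)}{|\mathit{TP}|}}_{\text{SQ}}
    \times
    \underbrace{\frac{|\mathit{TP}|}{|\mathit{TP}| + \tfrac{1}{2}|\mathit{FP}| + \tfrac{1}{2}|\mathit{FN}|}}_{\text{RQ}}
\]
```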
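
The hyperparameters quoted in the Hardware Specification and Experiment Setup rows can be collected into a single training configuration. The sketch below is a hypothetical consolidation for illustration only: the dictionary keys, the `learning_rate` helper, and the choice of the 50K COCO schedule are assumptions, not the authors' DeepLab2/TensorFlow code.

```python
# Hypothetical consolidation of the reported ReMaX training settings.
# Values are taken from the quotes above; names/structure are illustrative.

TRAIN_CONFIG = {
    "optimizer": "AdamW",            # AdamW [34, 49]
    "weight_decay": 0.005,           # 0.02 for the long schedule
    "base_learning_rate": 0.006,
    "backbone_lr_multiplier": 0.1,   # ImageNet-pretrained backbone
    "batch_size": 64,                # setup quote; hardware quote reports 32 on 32 TPUv3 cores
    "total_iterations": 50_000,      # short schedules: 50K / 100K; long schedule: 150K
    "crop_size": (1281, 1281),
    "scale_range": (0.3, 1.7),       # multi-scale training
    "reclass_eta": 0.1,              # η for the ReClass operation
}


def learning_rate(step: int,
                  base_lr: float = TRAIN_CONFIG["base_learning_rate"],
                  total_steps: int = TRAIN_CONFIG["total_iterations"]) -> float:
    """Step decay: multiply the base LR by 0.1 at 85% and again at 95% of training."""
    progress = step / total_steps
    if progress >= 0.95:
        return base_lr * 0.1 * 0.1
    if progress >= 0.85:
        return base_lr * 0.1
    return base_lr


if __name__ == "__main__":
    # Example: LR at the start, after the first decay point, and near the end.
    for step in (0, 45_000, 49_000):
        print(step, learning_rate(step))
```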