Switching Temporary Teachers for Semi-Supervised Semantic Segmentation

Authors: Jaemin Na, Jung-Woo Ha, Hyung Jin Chang, Dongyoon Han, Wonjun Hwang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Ultimately, our approach achieves competitive performance with state-of-the-art methods [41, 15, 24] while requiring much shorter training times and fewer parameters. We demonstrate the superiority of Dual Teacher under the semi-supervised semantic segmentation protocol using public benchmarks, including PASCAL VOC [9] and Cityscapes [8]. We further prove the scalability of our method using the large-scale dataset, ADE20K [55], and have made our experimental protocols available to the public. We conduct extensive experiments not only with ResNet-50/-101 but also with Transformer-based SegFormer [43]. In this section, we demonstrate the efficacy of our proposed method by conducting comprehensive comparisons with state-of-the-art approaches on various publicly available benchmarks. Furthermore, we provide additional ablation studies to justify the significance and impact of our method.
Researcher Affiliation | Collaboration | Jaemin Na1, Jung-Woo Ha2, Hyung Jin Chang3, Dongyoon Han2, Wonjun Hwang1,2 (Ajou University, Korea1; NAVER AI Lab2; University of Birmingham, UK3)
Pseudocode | No | The paper provides mathematical objective functions and descriptions of processes, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code.
Open Source Code | Yes | Code is available at https://github.com/naver-ai/dual-teacher.
Open Datasets | Yes | We evaluate our method on three public benchmarks in semantic segmentation, including the PASCAL VOC [9], Cityscapes [8], and ADE20K [55] datasets.
Dataset Splits | Yes | The fine-annotated images [Cityscapes] are split into training, validation, and test sets with 2,975, 500, and 1,525 images, respectively. [ADE20K] consists of 25,574, 2,000, and 3,000 images in the training, validation, and test set, respectively. (A configuration sketch of these split sizes follows the table.)
Hardware Specification | No | The paper describes the model architectures (DeepLabv3+, ResNet-50/-101, SegFormer) and datasets used, along with training hyperparameters, but it does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used to conduct the experiments.
Software Dependencies | No | The paper mentions using the stochastic gradient descent (SGD) optimizer and refers to model architectures like DeepLabv3+ and SegFormer. It also cites MMSegmentation [7] as a toolbox. However, it does not provide specific version numbers for programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other key software libraries and their dependencies.
Experiment Setup | Yes | For training on PASCAL VOC, we employ the stochastic gradient descent (SGD) optimizer with an initial learning rate of 0.001 and a weight decay of 0.001. We set the batch size to 16 and run 80 epochs. Following the previous protocol [46], we use an image resolution of 321×321 for the original high-quality training set while configuring the blender set to 513×513. For Cityscapes, we also use the SGD optimizer with an initial learning rate of 0.01, weight decay of 1e-4, batch size of 16, crop size of 769×769, and a total of 200 training epochs. For training on ADE20K, we follow the recipe of SegFormer [43, 7] and set a learning rate of 6e-5 with a poly LR scheduler with a factor of 1.0. ... We train the methods with a batch size of 8 and a crop size of 512×512 for the MiT-B1 backbone. Following the previous work [41], we progressively increase EMA weights up to 0.99 across all experiments. (A training-configuration sketch of these hyperparameters follows the table.)
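
For quick reference, the split sizes quoted in the Dataset Splits row can be recorded as a small configuration. This is a minimal sketch assuming a plain Python dictionary; the layout and key names are illustrative and not taken from the authors' code.

```python
# Split sizes quoted in the table above (number of images per split).
# Dictionary layout and key names are illustrative assumptions only.
DATASET_SPLITS = {
    "cityscapes": {"train": 2_975, "val": 500, "test": 1_525},   # fine-annotated images
    "ade20k":     {"train": 25_574, "val": 2_000, "test": 3_000},
}

if __name__ == "__main__":
    for name, splits in DATASET_SPLITS.items():
        print(f"{name}: {splits} (total {sum(splits.values())} images)")
```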
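
To show how the reported hyperparameters fit together, below is a minimal PyTorch-style sketch of the PASCAL VOC optimizer settings, the poly LR decay used for the ADE20K/SegFormer recipe, and the progressive EMA teacher update ramped up to 0.99. The SGD momentum value, the linear ramp schedule, and the function and variable names are assumptions for illustration only; the authors' actual implementation is in the linked repository.

```python
import torch


def build_voc_optimizer(model: torch.nn.Module) -> torch.optim.SGD:
    # PASCAL VOC recipe quoted above: SGD, initial lr = 0.001, weight decay = 0.001,
    # batch size 16, 80 epochs. Momentum is not stated in the quoted setup;
    # 0.9 is an assumed, commonly used default.
    return torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=1e-3)


def poly_lr_lambda(step: int, total_steps: int, power: float = 1.0) -> float:
    # Poly LR decay with factor 1.0, as in the quoted ADE20K/SegFormer recipe.
    # Intended for torch.optim.lr_scheduler.LambdaLR, which multiplies the base
    # learning rate by this value at each scheduler step.
    return (1.0 - step / float(total_steps)) ** power


@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               step: int, total_steps: int, max_momentum: float = 0.99) -> None:
    # The paper reports progressively increasing the EMA weight up to 0.99;
    # the exact ramp schedule is not quoted here, so a simple linear ramp is assumed.
    m = max_momentum * min(1.0, step / max(1, total_steps))
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(m).add_(s_p, alpha=1.0 - m)
    # In practice, buffers (e.g., BatchNorm running statistics) would also need
    # to be synchronized from student to teacher.
```

In a typical training loop, a LambdaLR scheduler built from `poly_lr_lambda` would be stepped once per iteration, and `ema_update` would be called after each optimizer step so the teacher tracks the student with a momentum that grows from 0 toward 0.99.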