Unsupervised Scene Adaptation with Memory Regularization in vivo

Authors: Zhedong Zheng, Yi Yang

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Albeit simple, we verify the effectiveness of memory regularization on two synthetic-to-real benchmarks: GTA5 → Cityscapes and SYNTHIA → Cityscapes, yielding +11.1% and +11.3% mIoU improvement over the baseline model, respectively. Besides, a similar +12.0% mIoU improvement is observed on the cross-city benchmark: Cityscapes → Oxford RobotCar."
Researcher Affiliation | Collaboration | "Zhedong Zheng (1,2), Yi Yang (1); (1) ReLER, University of Technology Sydney, Australia; (2) Baidu Research, China"
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our PyTorch implementation is available at https://github.com/layumi/Seg-Uncertainty."
Open Datasets | Yes | "We mainly evaluate the proposed method on the two unsupervised scene adaptation settings, i.e., GTA5 [Richter et al., 2016] → Cityscapes [Cordts et al., 2016] and SYNTHIA [Ros et al., 2016] → Cityscapes [Cordts et al., 2016]. Both source datasets, i.e., GTA5 and SYNTHIA, are synthetic. GTA5 contains 24,966 training images, while SYNTHIA has 9,400 images for training. The target dataset, Cityscapes, is collected in realistic scenarios and includes 2,975 unlabeled training images. Besides, we also evaluate the proposed method on the cross-city benchmark: Cityscapes [Cordts et al., 2016] → Oxford RobotCar [Maddern et al., 2017]."
Dataset Splits | Yes | "We follow the setting in [Tsai et al., 2019] and evaluate the model on the Cityscapes validation set / RobotCar validation set."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions 'PyTorch' and 'PaddlePaddle' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | "The input image is resized to 1280×640, and we randomly crop 1024×512 for training. We deploy the SGD optimizer with batch size 2 for the segmentation model, and the initial learning rate is set to 0.0002. The optimizer of the discriminator is Adam and the learning rate is set to 0.0001. Following [Zhao et al., 2017; Zhang et al., 2019], both the segmentation model and the discriminator deploy poly learning-rate decay, multiplying the initial rate by the factor (1 − iter/total_iter)^0.9. We set the total number of iterations to 100k and adopt an early-stop policy. The model is first trained without the memory regularization for 10k iterations to avoid the initial prediction noise, and then we add the memory regularization to the model training. For Stage-I, we train the model for 25k iterations. We further fine-tune the model in Stage-II for 25k iterations. We also adopt the class-balance policy in [Zou et al., 2018] to increase the weight of rare classes and small-scale objects. At inference, we combine the outputs of both classifiers: ŷ_t^j = argmax(F_p(x_t^j) + 0.5·F_a(x_t^j)). We follow the setting in PSPNet [Zhao et al., 2017] and set a weight of 0.5 on the auxiliary classifier's segmentation losses: L_seg = L_seg^p + 0.5·L_seg^a, L_pseg = L_pseg^p + 0.5·L_pseg^a. For adversarial losses, we follow the setting in [Tsai et al., 2018; Luo et al., 2019b] and select small weights for the adversarial loss terms: L_adv = 0.001·L_adv^p + 0.0002·L_adv^a. Besides, we fix the weight of the memory regularization at λ_mr = 0.1 for all experiments."
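The learning-rate schedule, inference-time classifier fusion, and loss weighting quoted above can be sketched in a few lines of NumPy. This is a minimal illustration of the reported settings, not the authors' implementation; the function names are ours, and scalar values stand in for the real loss tensors:

```python
import numpy as np

def poly_lr(base_lr, cur_iter, total_iter, power=0.9):
    """Poly learning-rate decay from the quoted setup:
    lr = base_lr * (1 - cur_iter / total_iter) ** power."""
    return base_lr * (1.0 - cur_iter / total_iter) ** power

def combined_prediction(logits_primary, logits_aux, aux_weight=0.5):
    """Inference-time fusion of the two classifiers:
    y_hat = argmax(F_p(x) + 0.5 * F_a(x)) over the class axis."""
    return np.argmax(logits_primary + aux_weight * logits_aux, axis=-1)

def total_loss(l_seg_p, l_seg_a, l_adv_p, l_adv_a, l_mr, lambda_mr=0.1):
    """Weighted sum of the loss terms listed in the setup: 0.5 on the
    auxiliary segmentation loss, 0.001 / 0.0002 on the adversarial
    terms, and lambda_mr = 0.1 on the memory regularization."""
    return (l_seg_p + 0.5 * l_seg_a
            + 0.001 * l_adv_p + 0.0002 * l_adv_a
            + lambda_mr * l_mr)

# With the paper's settings (base lr 2e-4, 100k total iterations),
# the learning rate halfway through training is 2e-4 * 0.5**0.9.
lr_mid = poly_lr(2e-4, 50_000, 100_000)
```

In a real pipeline `logits_primary` and `logits_aux` would be per-pixel class-score maps; the fusion applies unchanged because `np.argmax(..., axis=-1)` operates over the class axis regardless of the spatial shape.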