SPIGAN: Privileged Adversarial Learning from Simulation

Authors: Kuan-Hui Lee, German Ros, Jie Li, Adrien Gaidon

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experimentally evaluate our approach on semantic segmentation. We train the networks on real-world Cityscapes and Vistas datasets, using only unlabeled real-world images and synthetic labeled data with z-buffer (depth) PI from the SYNTHIA dataset."
Researcher Affiliation | Industry | Kuan-Hui Lee, Jie Li, Adrien Gaidon (Toyota Research Institute, {kuan.lee,jie.li,adrien.gaidon}@tri.global) and German Ros (Intel Labs, german.ros@intel.com)
Pseudocode | No | The paper includes a system architecture diagram (Figure 2) but no pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide links or any explicit statement about releasing source code for its method.
Open Datasets | Yes | "As our source synthetic domain, we select the public SYNTHIA dataset (Ros et al., 2016) as synthetic source domain given the availability of automatic annotations and PI. For target real-world domains, we use the Cityscapes (Cordts et al., 2016) and Mapillary Vistas (Neuhold et al., 2017) datasets."
Dataset Splits | Yes | "For this dataset, we use the standard split for training and validation with 2,975 and 500 images respectively. [...] We use 16,000 images for training and 2,000 images for evaluation."
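The quoted splits can be summarized and sanity-checked with a short script; the dictionary layout below is illustrative, and only the split sizes come from the report:

```python
# Train/val split sizes quoted in the report (Cityscapes and Mapillary Vistas).
splits = {
    "Cityscapes": {"train": 2975, "val": 500},
    "Vistas": {"train": 16000, "val": 2000},
}

for name, s in splits.items():
    total = s["train"] + s["val"]
    print(f"{name}: {s['train']} train / {s['val']} val ({total} total)")
```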
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or memory); it only mentions "our PyTorch implementation".
Software Dependencies | No | "The Adam optimizer (Kingma & Ba, 2014) is used to adjust all parameters with initial learning rate 0.0002 in our PyTorch implementation (Paszke et al., 2017)." While PyTorch is mentioned, no specific version number is provided, which is crucial for reproducibility.
Experiment Setup | Yes | "The weights in our joint adversarial loss (Eq. 1) are set to α = 1, β = 0.5, γ = 0.1, δ = 0.33, for the GAN, task, privileged, and perceptual objectives respectively. [...] We evaluate the methods with two resolutions: 320×640 and 512×1024, respectively. Images are resized to the evaluated size during training and evaluation. During training, we sample crops of size 320×320 (resp. 400×400) for lower (resp. higher) resolution experiments. In all adversarial learning cases, we do five steps of the generator for every step of the other networks. The Adam optimizer [...] with initial learning rate 0.0002."
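The reported hyperparameters can be assembled into a minimal sketch of the weighted joint objective (Eq. 1) and the update schedule. The function and variable names here are illustrative; only the weight values, the learning rate, and the 5:1 generator step ratio come from the paper:

```python
# Weights of the joint adversarial loss (Eq. 1), for the
# GAN, task, privileged, and perceptual objectives respectively.
ALPHA, BETA, GAMMA, DELTA = 1.0, 0.5, 0.1, 0.33
ADAM_LR = 0.0002          # initial learning rate reported for Adam
GEN_STEPS_PER_ROUND = 5   # generator steps per step of the other networks

def joint_loss(l_gan, l_task, l_priv, l_perc):
    """Weighted sum of the four component losses (inputs are placeholders)."""
    return ALPHA * l_gan + BETA * l_task + GAMMA * l_priv + DELTA * l_perc

# Example: with equal unit component losses the total is alpha+beta+gamma+delta.
print(joint_loss(1.0, 1.0, 1.0, 1.0))
```

In a full training loop, the generator would be updated `GEN_STEPS_PER_ROUND` times against this objective before each update of the discriminator, task, and privileged networks.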