Unsupervised Image-to-Image Translation Networks

Authors: Ming-Yu Liu, Thomas Breuel, Jan Kautz

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | "We compare the proposed framework with competing approaches and present high quality image translation results on various challenging unsupervised image translation tasks, including street scene image translation, animal image translation, and face image translation. We also apply the proposed framework to domain adaptation and achieve state-of-the-art performance on benchmark datasets. Code and additional results are available in https://github.com/mingyuliutw/unit."
Researcher Affiliation | Industry | "Ming-Yu Liu, Thomas Breuel, Jan Kautz NVIDIA {mingyul,tbreuel,jkautz}@nvidia.com"
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | "Code and additional results are available in https://github.com/mingyuliutw/unit."
Open Datasets | Yes | "We used the map dataset [8] (visualized in Figure 2), which contained corresponding pairs of images in two domains (satellite image and map) useful for quantitative evaluation. ... We also applied the approach to several tasks including adapting from the Street View House Number (SVHN) dataset [20] to the MNIST dataset and adapting between the MNIST and USPS datasets."
Dataset Splits | Yes | "We operated in an unsupervised setting where we used the 1096 satellite images from the training set as the first domain and 1098 maps from the validation set as the second domain." (A data-assembly sketch under this reading follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU model, CPU model, memory).
Software Dependencies | No | The paper mentions using ADAM for training but does not provide specific version numbers for any software libraries, frameworks, or programming languages used (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We used ADAM [11] for training where the learning rate was set to 0.0001 and momentums were set to 0.5 and 0.999. Each mini-batch consisted of one image from the first domain and one image from the second domain. Our framework had several hyper-parameters. The default values were λ0 = 10, λ3 = λ1 = 0.1 and λ4 = λ2 = 100. ... We trained for 100K iterations." (A training-configuration sketch follows the table.)
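
To make the Dataset Splits row concrete, here is a minimal sketch, assuming a PyTorch-style pipeline, of how the unpaired split quoted above could be assembled: the 1096 satellite images from the maps training set as one domain and the 1098 map images from the validation set as the other. The `UnpairedImageFolder` class and the directory paths are hypothetical illustrations, not the authors' code.

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class UnpairedImageFolder(Dataset):
    """One translation domain: every image found directly under `root`."""
    def __init__(self, root, transform=None):
        self.paths = sorted(
            os.path.join(root, name)
            for name in os.listdir(root)
            if name.lower().endswith((".jpg", ".jpeg", ".png"))
        )
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        image = Image.open(self.paths[index]).convert("RGB")
        return self.transform(image) if self.transform else image

# Hypothetical directory layout. The split is unpaired by construction:
# the two domains are drawn from disjoint subsets (train vs. validation)
# of the maps dataset, so no satellite/map pair is ever seen together.
to_tensor = transforms.ToTensor()
domain_a = UnpairedImageFolder("maps/train_satellite", transform=to_tensor)  # 1096 satellite images
domain_b = UnpairedImageFolder("maps/val_maps", transform=to_tensor)         # 1098 map images
```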
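
The Experiment Setup row fully specifies the optimizer, so the quoted numbers can be wired into a PyTorch training skeleton: Adam with learning rate 0.0001 and momentums (betas) 0.5 and 0.999, one image from each domain per mini-batch, and 100K iterations. The single-layer placeholder networks and the elided loss computation below are stand-ins, not the authors' method; their actual implementation is at https://github.com/mingyuliutw/unit.

```python
import itertools
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

# Loss weights quoted from the paper: lambda_0 = 10, lambda_1 = lambda_3 = 0.1,
# lambda_2 = lambda_4 = 100. How each weight enters the objective is defined
# in the paper and repository, not reproduced here.
LAMBDA_0, LAMBDA_1_3, LAMBDA_2_4 = 10.0, 0.1, 100.0

# Single-layer placeholders standing in for the UNIT VAE-GAN generators and
# discriminators; swap in the real networks from the repository.
gen = nn.Conv2d(3, 3, kernel_size=3, padding=1)
dis = nn.Conv2d(3, 1, kernel_size=3, padding=1)

# Adam exactly as quoted: learning rate 0.0001, momentums (betas) 0.5 and 0.999.
opt_gen = torch.optim.Adam(gen.parameters(), lr=1e-4, betas=(0.5, 0.999))
opt_dis = torch.optim.Adam(dis.parameters(), lr=1e-4, betas=(0.5, 0.999))

# One image from each domain per mini-batch, as stated in the paper.
# `domain_a` and `domain_b` are the datasets from the previous sketch.
loader_a = DataLoader(domain_a, batch_size=1, shuffle=True)
loader_b = DataLoader(domain_b, batch_size=1, shuffle=True)

# itertools.cycle replays the first pass's order; a full implementation would
# re-create each iterator every epoch to reshuffle.
for step, (x_a, x_b) in enumerate(
        zip(itertools.cycle(loader_a), itertools.cycle(loader_b))):
    if step >= 100_000:  # "We trained for 100K iterations"
        break
    # ... compute the weighted VAE, GAN, and cycle-consistency losses here,
    # then step opt_dis and opt_gen in alternation ...
```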