Harmonic Unpaired Image-to-image Translation

Authors: Rui Zhang, Tomas Pfister, Jia Li

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show experimental results in a number of applications including medical imaging, object transfiguration, and semantic labeling. We outperform the competing methods in all tasks, and for a medical imaging task in particular our method turns CycleGAN from a failure to a success, halving the mean-squared error, and generating images that radiologists prefer over competing methods in 95% of cases.
Researcher Affiliation | Collaboration | Rui Zhang (Google Cloud AI & Chinese Academy of Sciences, Beijing, China) zhangrui@ict.ac.cn; Tomas Pfister (Google Cloud AI, Sunnyvale, USA) tpfister@google.com; Jia Li (Google Cloud AI, Sunnyvale, USA) lijiali@google.com
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks clearly labeled as "Algorithm" or "Pseudocode". Figure 3 is an architecture diagram, not pseudocode.
Open Source Code | No | The paper does not provide concrete access to source code (a specific repository link, an explicit code-release statement, or code in supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | Medical imaging: this task evaluates cross-modal medical image synthesis, FLAIR → T1. The models are trained on the BRATS dataset (Menze et al., 2015)... Semantic labeling: we also test our method on the labels → photos task using the Cityscapes dataset (Cordts et al., 2016)... Object transfiguration: finally, we test our method on the horse → zebra task using the standard CycleGAN dataset...
Dataset Splits | Yes | Similar to previous work (Cohen et al., 2018), we use a training set of 1400 image slices (50% healthy and 50% tumors) and a test set of 300, and use their unpaired training scenario... for labels → photos we adopt the FCN score (Isola et al., 2017)... the standard CycleGAN dataset (2401 training images, 260 test images).
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed machine specifications) used for running its experiments. It mentions using a VGG network, which implies GPU usage, but no specific models are named.
Software Dependencies | No | The paper mentions software components such as the "Adam optimizer" and a "VGG network" but does not provide version numbers for any software, libraries, or frameworks used (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | Similar to CycleGAN, we adopt the architecture of (Johnson et al., 2016) as the generator and the PatchGAN (Isola et al., 2017) as the discriminator. The log-likelihood objective in the original GAN is replaced with a least-squares loss (Mao et al., 2017) for more stable training. We resize the input images to the size of 256 × 256. For the histogram feature, we equally split the RGB range of [0, 255] into 16 bins, each with a range of 16. Images are divided into non-overlapping patches of 8 × 8 and the histogram feature is computed on each patch. For the semantic feature, we adopt a VGG network pre-trained on ImageNet to obtain semantic features. We select the feature map of layer relu4_3 in VGG. The loss weights are set as λ_GAN = λ_Smooth = 1, λ_cyc = 10. Following CycleGAN, we adopt the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.0002. The learning rate is fixed for the first 100 epochs and linearly decayed to zero over the next 100 epochs.
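Two details of the quoted setup are easy to pin down concretely: the per-patch histogram feature (16 equal bins over [0, 255], non-overlapping 8 × 8 patches) and the learning-rate schedule (fixed at 0.0002 for 100 epochs, then linear decay to zero over the next 100). The NumPy sketch below is a hedged reconstruction from the paper's description only, not the authors' code; the function names `patch_histograms` and `lr_at_epoch` and the flat output layout are our own choices.

```python
import numpy as np

def patch_histograms(img, patch=8, bins=16):
    """Per-patch RGB histogram features as the setup describes:
    [0, 255] split into 16 equal bins (each of width 16), computed
    on non-overlapping 8x8 patches. `img` is an HxWx3 uint8 array
    with H and W divisible by `patch` (e.g. the 256x256 inputs)."""
    h, w, c = img.shape
    feats = []
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            p = img[y:y + patch, x:x + patch]
            # One 16-bin histogram per color channel, concatenated.
            hist = [np.histogram(p[..., ch], bins=bins, range=(0, 256))[0]
                    for ch in range(c)]
            feats.append(np.concatenate(hist))
    return np.stack(feats)  # shape: (num_patches, bins * channels)

def lr_at_epoch(epoch, base_lr=2e-4, fixed=100, decay=100):
    """Adam learning rate per the quoted schedule: constant for the
    first 100 epochs, then linearly decayed to zero over 100 more."""
    if epoch < fixed:
        return base_lr
    return base_lr * max(0.0, 1.0 - (epoch - fixed) / decay)
```

For a 256 × 256 input this yields 32 × 32 = 1024 patches, each described by a 48-dimensional vector (16 bins × 3 channels); the schedule gives 0.0002 through epoch 99, half that at epoch 150, and zero from epoch 200 on.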