Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation
Authors: Soochan Lee, Junsoo Ha, Gunhee Kim
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our approach is applicable to any conditional generation task by performing thorough experiments on image-to-image translation, super-resolution and image inpainting using the Cityscapes and CelebA datasets. Quantitative evaluations also confirm that our methods achieve a great diversity in outputs while retaining or even improving the visual fidelity of generated samples. [...] In order to show the generality of our methods, we apply them to three conditional generation tasks: image-to-image translation, super-resolution and image inpainting, for each of which we select Pix2Pix, SRGAN (Ledig et al., 2017) and GLCIC (Iizuka et al., 2017) as base models, respectively. We use the Maps (Isola et al., 2017) and Cityscapes (Cordts et al., 2016) datasets for image translation and the CelebA dataset (Liu et al., 2015) for the other tasks. |
| Researcher Affiliation | Academia | Soochan Lee, Junsoo Ha & Gunhee Kim Department of Computer Science and Engineering, Seoul National University, Seoul, Korea soochan.lee@vision.snu.ac.kr, junsooha@hanyang.ac.kr, gunhee@snu.ac.kr |
| Pseudocode | Yes | (Appendix A, Algorithms) We elaborate on the algorithms of all eight variants of our methods in detail from Algorithm 5 to Algorithm 4. (A hedged sketch of the sampling-based loss these algorithms share appears after the table.) |
| Open Source Code | No | The paper mentions a project website (http://vision.snu.ac.kr/projects/mr-gan) and notes that its variants are built on existing open-source projects (e.g., 'Our Pix2Pix variant is based on the U-net generator from https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.'), but it does not provide a direct link to the source code of the proposed MR-GAN and proxy MR-GAN implementations. |
| Open Datasets | Yes | We use Maps (Isola et al., 2017) and Cityscapes dataset (Cordts et al., 2016) for image translation and Celeb A dataset (Liu et al., 2015) for the other tasks. |
| Dataset Splits | No | The paper states 'In case of proxy MR-GAN, we train the predictor until it is overfitted, and use the checkpoint with the lowest validation loss.' but does not provide specific details on the training, validation, and test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or cloud computing specifications. |
| Software Dependencies | No | The paper states 'We use PyTorch for the implementation of our methods.' but does not specify the version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | In every experiment, we use the AMSGrad optimizer (Reddi et al., 2018) with LR = 10⁻⁴, β₁ = 0.5, β₂ = 0.999. We use weight decay at a rate of 10⁻⁴ and gradient clipping by a value of 0.5. [...] Batch sizes: We use 16 for the discriminator and the predictor and 8 for the generator. When training the generator, we generate 10 samples for each input, therefore its total batch size is 80. Loss weights: We set λ_MR = λ_p-MR = 10. For the baseline, we use ℓ1 loss as the reconstruction loss and set λ_ℓ1 = 100. (A configuration sketch of these hyperparameters follows the table.) |
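The pseudocode itself is not reproduced in the excerpt above, but the quoted setup (10 generator samples per input, λ_MR = 10) indicates the structure the algorithms share: rather than pulling each generated sample toward the ground truth, a moment-reconstruction (MR) loss matches a statistic of several samples to it. Below is a hedged PyTorch sketch of our reading of a first-moment (mean-matching) variant; `generator`, `z_dim`, and the Gaussian noise prior are assumptions, and the paper's eight algorithm variants differ in which statistics and penalties they use.

```python
import torch

def mr_loss_first_moment(generator, x, y, z_dim, K=10, lambda_mr=10.0):
    """Hedged sketch of a first-moment (mean-matching) MR-style loss.

    Instead of penalizing each generated sample against the ground truth y
    (which pushes all modes toward the conditional mean), this draws K
    samples per input and penalizes only the l1 distance between the
    per-pixel sample mean and y. `generator` and the Gaussian prior on z
    are assumptions, not the authors' code.
    """
    b = x.size(0)
    x_rep = x.repeat_interleave(K, dim=0)             # (B*K, C, H, W)
    z = torch.randn(b * K, z_dim, device=x.device)    # one noise draw per sample
    samples = generator(x_rep, z)
    samples = samples.view(b, K, *samples.shape[1:])  # (B, K, C, H, W)
    mu = samples.mean(dim=1)                          # sample mean per input
    return lambda_mr * torch.nn.functional.l1_loss(mu, y)
```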
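The quoted hyperparameters also map directly onto a PyTorch optimizer configuration. The following is a minimal sketch under stated assumptions: AMSGrad is realized as Adam with `amsgrad=True`, "gradient clipping by a value 0.5" is read as clip-by-value (a clip-by-norm reading is also possible), and the model and data are placeholders rather than the paper's networks.

```python
import torch
import torch.nn as nn

# Placeholder model; the paper trains a generator, a discriminator, and
# (for proxy MR-GAN) a predictor with this same optimizer setup.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,                # "LR = 10^-4"
    betas=(0.5, 0.999),     # "beta1 = 0.5, beta2 = 0.999"
    weight_decay=1e-4,      # "weight decay of a rate 10^-4"
    amsgrad=True,           # AMSGrad (Reddi et al., 2018)
)

x = torch.randn(8, 3, 64, 64)   # generator batch size 8, per the quote
y = torch.randn(8, 3, 64, 64)
loss = nn.functional.l1_loss(model(x), y)
loss.backward()
# Clip-by-value reading of "gradient clipping by a value 0.5".
torch.nn.utils.clip_grad_value_(model.parameters(), 0.5)
optimizer.step()
optimizer.zero_grad()
```

In the paper's setup this configuration would be instantiated once per network, with the batch sizes quoted above (16 for the discriminator and predictor, 8 for the generator, expanded to 80 by the 10 samples drawn per input).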