Diversity-Sensitive Conditional Generative Adversarial Networks
Authors: Dingdong Yang, Seunghoon Hong, Yunseok Jang, Tianchen Zhao, Honglak Lee
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method on three conditional generation tasks: image-to-image translation, image inpainting, and future video prediction. We show that simple addition of our regularization to existing models leads to surprisingly diverse generations, substantially outperforming the previous approaches for multi-modal conditional generation specifically designed in each individual task. |
| Researcher Affiliation | Collaboration | University of Michigan, Ann Arbor, MI, USA; Google Brain, Mountain View, CA, USA |
| Pseudocode | No | The paper contains mathematical derivations but no explicit blocks or figures labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | We are going to release the code and datasets upon the acceptance of the paper. |
| Open Datasets | Yes | We evaluate the results on three datasets: label→image (Radim Tyleček, 2013), edge→photo (Zhu et al., 2016; Yu & Grauman, 2014), map→image (Isola et al., 2017)... Table 3 shows the comparison results on the Cityscapes dataset (Cordts et al., 2016)... we take 256×256 images of centered faces from the CelebA dataset (Liu et al., 2015)... We conduct experiments on two datasets from the previous literature: the BAIR action-free robot pushing dataset (Ebert et al., 2017) and the KTH human actions dataset (Schuldt et al., 2004). |
| Dataset Splits | Yes | We use BicycleGAN's generator and discriminator structures as a baseline cGAN. The baseline model has exactly the same hyperparameters as BicycleGAN, including the weight of the pixel-wise L1 loss and the GAN loss. The baseline model setting then basically becomes a pix2pix image-to-image translation setting (Isola et al., 2017)... We downloaded the pre-processed data provided by the authors (Lee et al., 2018) and used it directly for our experiment... We download the pre-processed videos from Villegas et al. (2017); Denton & Fergus (2018) |
| Hardware Specification | No | The paper does not mention any specific hardware details such as GPU models, CPU types, or memory specifications used for experiments. |
| Software Dependencies | No | The paper mentions software components like 'U-Net', 'PatchGAN', 'VGGNet', 'AlexNet', 'Inception V3', and 'convolutional LSTM' but does not specify their version numbers or the version of the programming language used. |
| Experiment Setup | Yes | Our full objective function can be written as: min_G max_D L_cGAN(G, D) - λ L_z(G) (Eq. 3), where λ controls the importance of the regularization... we use the following objective: min_G max_D L_cGAN(G, D) + β L_rec(G) - λ L_z(G) (Eq. 5)... Unless otherwise stated, we use the l1 distance for L_rec(G) = \|G(x, z) - y\|_1 and the l1 norm for L_z(G)... We also conducted an experiment varying the length of the latent code z. Table 2 summarizes the results. As discussed in Zhu et al. (2017b), the generation quality of BicycleGAN degrades with high-dimensional z due to the difficulty of matching the encoder distribution E(x) with the prior distribution p(z). Compared to BicycleGAN, our method suffers less from this issue by sampling z from the prior distribution, and thus exhibits consistent performance over various latent code sizes. |
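
For reference, the sketch below illustrates the diversity regularization term L_z(G) quoted above (Eqs. 3 and 5): two latent codes are drawn from the prior for the same condition, and the l1 distance between the two generations is divided by the l1 distance between the codes, bounded by a constant τ. This is a minimal PyTorch sketch; the function name `diversity_regularization`, the default `tau`, and the per-sample mean reduction are illustrative assumptions, since the authors' code was not released at submission time.

```python
import torch

def diversity_regularization(generator, x, z_dim, tau=10.0):
    """Bounded diversity term L_z(G): output distance over latent distance."""
    batch = x.size(0)
    # Two latent codes drawn from the prior N(0, I) for the same condition x.
    z1 = torch.randn(batch, z_dim, device=x.device)
    z2 = torch.randn(batch, z_dim, device=x.device)
    y1 = generator(x, z1)
    y2 = generator(x, z2)
    # Per-sample l1 distance between the two generations.
    out_dist = (y1 - y2).abs().flatten(start_dim=1).mean(dim=1)
    # Per-sample l1 distance between the two latent codes.
    z_dist = (z1 - z2).abs().mean(dim=1)
    # Bound the ratio by tau so the regularizer cannot dominate the GAN loss.
    return torch.clamp(out_dist / (z_dist + 1e-8), max=tau).mean()

# Generator update corresponding to Eq. (5): minimize the GAN and reconstruction
# losses while maximizing the diversity term (hence the minus sign), e.g.
# g_loss = gan_loss + beta * recon_loss - lam * diversity_regularization(G, x, z_dim)
```

In this reading, the generator is penalized whenever distinct latent codes collapse to near-identical outputs, which is what the quoted objective adds on top of a standard cGAN or cGAN-plus-reconstruction baseline.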