Multi-domain image generation and translation with identifiability guarantees
Authors: Shaoan Xie, Lingjing Kong, Mingming Gong, Kun Zhang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first present results and analysis on the multi-domain image generation task. Then we provide results on unpaired image translation. |
| Researcher Affiliation | Academia | Shaoan Xie (1), Lingjing Kong (1), Mingming Gong (3,2), and Kun Zhang (1,2); (1) Carnegie Mellon University, (2) Mohamed bin Zayed University of Artificial Intelligence, (3) The University of Melbourne |
| Pseudocode | No | The paper does not include a pseudocode block or a clearly labeled algorithm. |
| Open Source Code | Yes | The training code is available at https://github.com/Mid-Push/i-stylegan. |
| Open Datasets | Yes | We use five datasets to evaluate our method: CelebA-HQ (Choi et al., 2020) contains female and male face domains; AFHQ (Choi et al., 2020) contains 3 domains: cat, dog and wildlife; Art Photo contains 4 domains: Cezanne, Monet, Photo and Ukiyoe; CelebA5 contains 5 domains: Black Hair, Blonde Hair, Eyeglasses, Mustache and Pale Skin, which are subsets of the CelebA dataset (Liu et al., 2015). We train them at 64×64 resolution. MNIST7 contains 7 domains: blue, cyan, green, purple, red, white and yellow MNIST digits, generated from the MNIST training set (LeCun et al., 1998); a colorization sketch appears after the table. |
| Dataset Splits | No | The paper does not explicitly provide details about specific training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits). |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments (e.g., specific GPU models, CPU models, or cloud resources). |
| Software Dependencies | No | The paper mentions building on the official PyTorch implementation of StyleGAN2-ADA but does not list specific software versions or other dependencies. |
| Experiment Setup | Yes | Implementation: We build our method based on the official PyTorch implementation of StyleGAN2-ADA (Karras et al., 2020a), and the hyper-parameters are selected automatically by the code. We choose the deep sigmoid flow (DSF) (Huang et al., 2018a) to implement the domain transformation f_u, because DSF is designed to be component-wise strictly increasing. We use the embedding of the domain label to generate pseudo-parameters for the flow (a sketch of such a flow appears after the table). We introduce only one hyper-parameter, λ, which controls the sparsity of the mask; we set λ = 0.1 for all experiments. |
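
The paper states that the MNIST7 domains are generated from the MNIST training set but does not describe the colorization procedure. Below is a minimal Python sketch of one plausible approach, assuming each grayscale digit is tinted by per-channel multiplication with a fixed RGB color; the color values and helper names (`DOMAIN_COLORS`, `colorize`) are illustrative, not taken from the released code.

```python
# Hypothetical sketch of how the MNIST7 domains could be built; the paper
# only lists the 7 colors, so the tinting rule below is an assumption.
import numpy as np
from torchvision import datasets

# Assumed RGB multipliers for the seven domains named in the paper.
DOMAIN_COLORS = {
    "blue":   (0.0, 0.0, 1.0),
    "cyan":   (0.0, 1.0, 1.0),
    "green":  (0.0, 1.0, 0.0),
    "purple": (0.5, 0.0, 0.5),
    "red":    (1.0, 0.0, 0.0),
    "white":  (1.0, 1.0, 1.0),
    "yellow": (1.0, 1.0, 0.0),
}

def colorize(gray_img: np.ndarray, color: tuple) -> np.ndarray:
    """Tint a (H, W) grayscale digit into a (H, W, 3) RGB image."""
    gray = gray_img.astype(np.float32) / 255.0          # scale to [0, 1]
    rgb = np.stack([gray * c for c in color], axis=-1)  # per-channel tint
    return (rgb * 255).astype(np.uint8)

mnist = datasets.MNIST(root="data", train=True, download=True)
for name, color in DOMAIN_COLORS.items():
    # One colored copy of every training digit per domain.
    domain_images = [colorize(np.array(img), color) for img, _ in mnist]
```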
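The experiment-setup row names the pieces of the domain transformation f_u (a deep sigmoid flow whose pseudo-parameters come from a domain-label embedding) without giving the architecture. The PyTorch sketch below shows how those pieces could fit together, following the sigmoid-flow construction of Huang et al. (2018a); all class names, layer counts, and dimensions (`SigmoidFlowLayer`, `DomainDSF`, `n_components`, `emb_dim`) are assumptions, not the authors' implementation.

```python
# Minimal sketch of a domain-conditional deep sigmoid flow (DSF),
# following Huang et al. (2018a). Names and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SigmoidFlowLayer(nn.Module):
    """One sigmoid-flow layer: x -> logit(sum_k w_k * sigmoid(a_k * x + b_k))."""
    def __init__(self, n_components: int = 8):
        super().__init__()
        self.k = n_components

    def forward(self, x, params):
        # params: (batch, dim, 3k) pseudo-parameters from the domain embedding.
        a, b, w = params.split(self.k, dim=-1)
        a = F.softplus(a)                      # positive slopes
        w = torch.softmax(w, dim=-1)           # convex-combination weights
        s = (w * torch.sigmoid(a * x.unsqueeze(-1) + b)).sum(-1)
        s = s.clamp(1e-6, 1 - 1e-6)            # keep the logit finite
        return torch.logit(s)

class DomainDSF(nn.Module):
    """Stack of sigmoid-flow layers parameterized by a domain-label embedding."""
    def __init__(self, n_domains, dim, n_layers=2, k=8, emb_dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_domains, emb_dim)
        self.layers = nn.ModuleList(SigmoidFlowLayer(k) for _ in range(n_layers))
        # One pseudo-parameter generator per layer: embedding -> (dim, 3k).
        self.param_nets = nn.ModuleList(
            nn.Linear(emb_dim, dim * 3 * k) for _ in range(n_layers)
        )
        self.dim, self.k = dim, k

    def forward(self, z, domain):
        emb = self.embed(domain)               # (batch, emb_dim)
        for layer, net in zip(self.layers, self.param_nets):
            params = net(emb).view(-1, self.dim, 3 * self.k)
            z = layer(z, params)
        return z

# Usage: transform a batch of 512-d latents under 7 domains.
flow = DomainDSF(n_domains=7, dim=512)
y = flow(torch.randn(4, 512), torch.tensor([0, 1, 2, 3]))
```

The softplus on the slopes and the softmax over the mixture weights are what make each coordinate map strictly increasing, which is the component-wise monotonicity property the paper cites as the reason for choosing DSF.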