Multi-domain image generation and translation with identifiability guarantees

Authors: Shaoan Xie, Lingjing Kong, Mingming Gong, Kun Zhang

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we first present results and analysis on multi-domain image generation task. Then we provide the results on unpaired image translation."
Researcher Affiliation | Academia | "Shaoan Xie (1), Lingjing Kong (1), Mingming Gong (3,2), and Kun Zhang (1,2); (1) Carnegie Mellon University; (2) Mohamed bin Zayed University of Artificial Intelligence; (3) The University of Melbourne"
Pseudocode | No | The paper does not include a pseudocode block or a clearly labeled algorithm.
Open Source Code | Yes | "The training code is available at https://github.com/Mid-Push/i-stylegan."
Open Datasets | Yes | "We use five datasets to evaluate our method: CelebA-HQ (Choi et al., 2020) contains female and male face domains; AFHQ (Choi et al., 2020) contains 3 domains: cat, dog and wildlife; Art Photo contains 4 domains: Cezanne, Monet, Photo and Ukiyoe; CelebA5 contains 5 domains: Black Hair, Blonde Hair, Eyeglasses, Mustache and Pale Skin. They are subsets of the CelebA dataset (Liu et al., 2015). We train them at 64×64 resolution. MNIST7 contains 7 domains: blue, cyan, green, purple, red, white and yellow MNIST digits. We generate these digits using the training MNIST dataset (LeCun et al., 1998)." (A plausible colored-MNIST construction is sketched below the table.)
Dataset Splits | No | The paper does not explicitly provide details about specific training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits).
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments (e.g., specific GPU models, CPU models, or cloud resources).
Software Dependencies | No | The paper does not explicitly list software dependencies or version numbers; the implementation section only notes that the method builds on the official PyTorch implementation of StyleGAN2-ADA.
Experiment Setup | Yes | "Implementation: We build our method on the official PyTorch implementation of StyleGAN2-ADA (Karras et al., 2020a), and the hyper-parameters are selected automatically by the code. We choose the deep sigmoid flow (DSF) (Huang et al., 2018a) to implement the domain transformation f_u because DSF is designed to be component-wise strictly increasing. We use the embedding of the domain label to generate pseudo-parameters for the flow. We introduce only one hyper-parameter, λ, to control the sparsity of the mask, and set λ = 0.1 for all experiments."
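
The section quotes no code for the domain transformation, so the following is a minimal PyTorch sketch of a single deep sigmoid flow layer in the spirit of Huang et al. (2018a), with the (a, b, w) pseudo-parameters produced from a domain-label embedding as the setup describes. The class name, hypernetwork layout, and all dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepSigmoidFlowLayer(nn.Module):
    """Component-wise map y = logit(sum_j w_j * sigmoid(a_j * x + b_j)).

    The per-domain parameters (a, b, w) are emitted by a small hypernetwork
    from a learned domain-label embedding ("pseudo-parameters").
    """

    def __init__(self, dim, n_domains, n_components=8, embed_dim=32):
        super().__init__()
        self.dim, self.k = dim, n_components
        self.embed = nn.Embedding(n_domains, embed_dim)
        # Emits three parameter groups (a, b, w), each of shape (dim, n_components).
        self.hyper = nn.Linear(embed_dim, 3 * dim * n_components)

    def forward(self, x, domain):
        # x: (batch, dim); domain: (batch,) integer domain labels.
        params = self.hyper(self.embed(domain)).view(-1, self.dim, 3 * self.k)
        a, b, w = params.chunk(3, dim=-1)
        a = F.softplus(a)                 # a > 0 keeps each component increasing
        w = torch.softmax(w, dim=-1)      # convex combination of sigmoids
        y = (w * torch.sigmoid(a * x.unsqueeze(-1) + b)).sum(-1)
        return torch.logit(y.clamp(1e-6, 1 - 1e-6))  # inverse sigmoid back to R

# Usage: transform 2-d latents for domain label 1 out of 3 domains.
flow = DeepSigmoidFlowLayer(dim=2, n_domains=3)
z = flow(torch.randn(4, 2), torch.tensor([1, 1, 1, 1]))
```

Positivity of a (via softplus) and the softmax over w make each output coordinate a strictly increasing function of the corresponding input coordinate, which is the property the setup cites as the reason for choosing DSF.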
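
The excerpt says only that λ = 0.1 weights a sparsity penalty on the mask; the exact form of the penalty is not given here. An L1 term, assumed in this sketch, is a common way to realize such a constraint.

```python
import torch

lam = 0.1                                   # value reported for all experiments
mask = torch.rand(512, requires_grad=True)  # hypothetical learnable mask over latent units
base_loss = torch.tensor(0.0)               # placeholder for the GAN objective
loss = base_loss + lam * mask.abs().sum()   # assumed L1 sparsity term
```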
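
Relatedly, the Open Datasets row mentions seven colored-MNIST domains generated from the MNIST training set without giving the recipe. A plausible construction, assumed here, tints the grayscale digits channel-wise; the COLORS table and the colorize helper are hypothetical.

```python
import numpy as np
from torchvision import datasets

# Hypothetical RGB tints; the paper does not specify its exact recipe.
COLORS = {
    "blue": (0.0, 0.0, 1.0), "cyan": (0.0, 1.0, 1.0), "green": (0.0, 1.0, 0.0),
    "purple": (0.5, 0.0, 0.5), "red": (1.0, 0.0, 0.0),
    "white": (1.0, 1.0, 1.0), "yellow": (1.0, 1.0, 0.0),
}

mnist = datasets.MNIST(root="data", train=True, download=True)
gray = mnist.data.numpy().astype(np.float32) / 255.0       # (60000, 28, 28)

def colorize(images, rgb):
    """Stack a tinted copy of the grayscale images per RGB channel."""
    return np.stack([images * c for c in rgb], axis=-1)    # (N, 28, 28, 3)

# One array of colored digits per domain, e.g. domains["red"].
domains = {name: colorize(gray, rgb) for name, rgb in COLORS.items()}
```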