Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Authors: Shengyu Zhao, Jonathan Cui, Yilun Sheng, Yue Dong, Xiao Liang, Eric I-Chao Chang, Yan Xu

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate superior performance in terms of both quality and diversity over state-of-the-art methods in free-form image completion, and easy generalization to image-to-image translation. We conduct image completion experiments at 512×512 resolution on the FFHQ dataset (Karras et al., 2019a) and the Places2 dataset (Zhou et al., 2017).
Researcher Affiliation | Collaboration | Shengyu Zhao (IIIS, Tsinghua University and Microsoft Research); Jonathan Cui (Vacaville Christian Schools); Yilun Sheng (IIIS, Tsinghua University and Microsoft Research); Yue Dong (IIIS, Tsinghua University); Xiao Liang (The High School Affiliated to Renmin University of China); Eric I-Chao Chang (Microsoft Research); Yan Xu (School of Biological Science and Medical Engineering and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks, nor are there any sections explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | Code is available at https://github.com/zsyzzsoft/co-mod-gan.
Open Datasets | Yes | We conduct image completion experiments at 512×512 resolution on the FFHQ dataset (Karras et al., 2019a) and the Places2 dataset (Zhou et al., 2017).
Dataset Splits | Yes | We preserve 10k out of 70k images from the FFHQ dataset for validation. Places2 has its own validation set of 36.5k images and a large training set of 8M images. (See the split sketch after this table.)
Hardware Specification | Yes | All experiments are run on 8 NVIDIA Tesla V100 GPUs.
Software Dependencies | No | The paper mentions using the Adam optimizer and borrowing details from StyleGAN2, but it does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We mostly borrow the network details and hyperparameters from StyleGAN2 (Karras et al., 2019b), including the number of convolutional layers (2) at each level, the number of channels (64 at 512×512 resolution, doubled at each coarser level with a maximum of 512), the architecture of the mapping network M (8-layer MLP), layer-wise noise injection, style mixing regularization (with a probability of 0.5 instead), non-saturating logistic loss (Goodfellow et al., 2014) with R1 regularization (Mescheder et al., 2018) of γ = 10, and the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.002. [...] The batch size is 4 per GPU, 32 in total. The training length is 25M images unless specified, which takes about 1 week at 512×512 resolution.
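
The Dataset Splits row reports a 60k/10k train/validation split of FFHQ. Below is a minimal Python sketch of such a split; the `ffhq` directory name and the choice to hold out the last 10,000 images in sorted order are assumptions for illustration, since the paper does not state which images were preserved.

```python
# Minimal sketch of the FFHQ 60k/10k split quoted above.
# Assumptions: images live in ./ffhq as 70,000 PNG files, and the
# held-out 10k are the last 10,000 in sorted order (the paper does
# not say which images were preserved for validation).
from pathlib import Path

files = sorted(Path("ffhq").glob("*.png"))  # hypothetical dataset location
assert len(files) == 70_000, "expected the full FFHQ dataset"

train_files = files[:60_000]   # 60k images for training
val_files = files[60_000:]     # 10k images preserved for validation
```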
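
The Experiment Setup row pins down the training objective: a non-saturating logistic loss with R1 regularization (γ = 10) and Adam at a learning rate of 0.002. The following is a hedged PyTorch-style sketch of that objective only; the tiny `G` and `D` here are placeholders rather than the paper's co-modulated networks (the authors' released code is TensorFlow-based), and StyleGAN2 details such as lazy regularization and non-default Adam betas are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder networks: the real co-modulated GAN is far larger.
G = nn.Sequential(nn.Conv2d(4, 3, 3, padding=1))  # masked image + mask -> completion
D = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

gamma = 10.0  # R1 regularization weight, as quoted from the paper
opt_g = torch.optim.Adam(G.parameters(), lr=0.002)
opt_d = torch.optim.Adam(D.parameters(), lr=0.002)

def d_step(real, masked, mask):
    # Non-saturating logistic loss for D: softplus(D(fake)) + softplus(-D(real)).
    fake = G(torch.cat([masked, mask], dim=1)).detach()
    real = real.detach().requires_grad_(True)
    d_real = D(real)
    loss = F.softplus(D(fake)).mean() + F.softplus(-d_real).mean()
    # R1 penalty: (gamma / 2) * E[ ||grad_x D(x)||^2 ] on real images.
    (grad,) = torch.autograd.grad(d_real.sum(), real, create_graph=True)
    loss = loss + (gamma / 2) * grad.pow(2).sum(dim=[1, 2, 3]).mean()
    opt_d.zero_grad(); loss.backward(); opt_d.step()

def g_step(masked, mask):
    # Non-saturating generator loss: softplus(-D(G(x))).
    loss = F.softplus(-D(G(torch.cat([masked, mask], dim=1)))).mean()
    opt_g.zero_grad(); loss.backward(); opt_g.step()
```

This sketch covers only the loss and optimizer named in the quote; the paper's mask generation, co-modulation, and the remaining StyleGAN2 training machinery are left out.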