Feature Quantization Improves GAN Training

Authors: Yang Zhao, Chunyuan Li, Ping Yu, Jianfeng Gao, Changyou Chen

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results show that the proposed FQ-GAN can improve the FID scores of baseline methods by a large margin on a variety of tasks, including three representative GAN models on 9 benchmarks, achieving new state-of-the-art performance.
Researcher Affiliation | Collaboration | (1) Department of Computer Science and Engineering, University at Buffalo, SUNY; (2) Microsoft Research, Redmond.
Pseudocode | Yes | Algorithm 1 Feature Quantization GAN
Open Source Code | Yes | The code is released on Github (footnote 1: https://github.com/YangNaruto/FQ-GAN).
Open Datasets | Yes | CIFAR-10 (Krizhevsky et al., 2009) consists of 60K images at resolution 32×32 in 10 classes; 50K for training and 10K for testing. CIFAR-100 (Krizhevsky et al., 2009)... ImageNet-1000 (Russakovsky et al., 2015)... The Flickr-Faces-HQ (FFHQ) dataset (Karras et al., 2019a)... Five unpaired image datasets are used for evaluation, including selfie2anime (Kim et al., 2020), cat2dog, photo2portrait (Lee et al., 2018), horse2zebra and vangogh2photo (Zhu et al., 2017).
Dataset Splits | No | The paper states train and test splits for datasets like CIFAR-10 ('50K for training and 10K for testing') but does not describe a dedicated validation split for hyperparameter tuning or early stopping, and gives no general statement of train/validation/test splits.
Hardware Specification | Yes | TITAN XP GPUs are used in these experiments.
Software Dependencies | No | The paper mentions using 'BigGAN-PyTorch' and 'TensorFlow codes' for various models, but does not specify exact version numbers for these software libraries or any other dependencies.
Experiment Setup | Yes | Dictionary size K. In Figure 4 (a), we show the FQ-GAN performance with various dictionary size K = 2^P. Momentum decay λ. Our experimental results in Figure 4 (c) show that λ = 0.9 is a sweet point to balance the current and historical statistics. FQ weight α. We used α = 1 for convenience. We train the model for 500 epochs, and save a model every 1000 iterations. Each model was trained using 25M images by default. Each model is trained for 100 epochs.
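
To make the three hyperparameters quoted in the Experiment Setup row concrete, the following is a minimal sketch of how a dictionary of size K = 2^P, a momentum decay λ, and an FQ weight α could fit together. It is not taken from the paper's Algorithm 1 or the released repository: it assumes a generic exponential-moving-average dictionary update (as commonly used for vector quantization) and a squared-distance quantization penalty. All function names (`nearest`, `quantize`, `ema_update`, `fq_loss`, `init_state`) are hypothetical, and gradient handling in the discriminator is omitted.

```python
import random

def nearest(code_book, h):
    """Index of the dictionary entry closest to feature h (squared L2)."""
    return min(range(len(code_book)),
               key=lambda k: sum((e - x) ** 2 for e, x in zip(code_book[k], h)))

def quantize(code_book, feats):
    """Replace every feature vector by its nearest dictionary entry."""
    idx = [nearest(code_book, h) for h in feats]
    return idx, [list(code_book[k]) for k in idx]

def init_state(code_book):
    """EMA statistics chosen so that sums[k] / counts[k] equals entry k."""
    counts = [1.0] * len(code_book)
    sums = [list(e) for e in code_book]
    return counts, sums

def ema_update(code_book, counts, sums, feats, idx, lam=0.9):
    """Assumed momentum update: blend historical and current-batch statistics."""
    dim = len(code_book[0])
    for k in range(len(code_book)):
        members = [h for h, i in zip(feats, idx) if i == k]
        counts[k] = lam * counts[k] + (1 - lam) * len(members)
        for d in range(dim):
            batch_sum = sum(h[d] for h in members)
            sums[k][d] = lam * sums[k][d] + (1 - lam) * batch_sum
        if counts[k] > 1e-8:  # skip entries whose statistics have decayed away
            code_book[k] = [s / counts[k] for s in sums[k]]

def fq_loss(feats, quantized, alpha=1.0):
    """alpha times the mean squared distance between features and their codes."""
    per_sample = [sum((x - q) ** 2 for x, q in zip(h, hq))
                  for h, hq in zip(feats, quantized)]
    return alpha * sum(per_sample) / len(per_sample)

# Usage with illustrative sizes (P and D are arbitrary choices, not from the paper).
random.seed(0)
P, D = 3, 2
K = 2 ** P  # dictionary size K = 2^P
code_book = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
counts, sums = init_state(code_book)
feats = [[random.gauss(0, 1) for _ in range(D)] for _ in range(16)]
idx, quantized = quantize(code_book, feats)
loss = fq_loss(feats, quantized, alpha=1.0)               # alpha = 1 as quoted
ema_update(code_book, counts, sums, feats, idx, lam=0.9)  # lambda = 0.9 as quoted
```

With λ = 0.9, each update keeps 90% of the historical statistics and mixes in 10% from the current batch, which is one way to read the quoted "balance the current and historical statistics". Entries that receive no features in a batch are left unchanged under this initialization.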