Improved Transformer for High-Resolution GANs
Authors: Long Zhao, Zizhao Zhang, Ting Chen, Dimitris Metaxas, Han Zhang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show in the experiments that the proposed HiT achieves state-of-the-art FID scores of 30.83 and 2.95 on unconditional ImageNet 128×128 and FFHQ 256×256, respectively, with a reasonable throughput. We validate the proposed method on three datasets: ImageNet [49], CelebA-HQ [25], and FFHQ [28]. We also adopt ImageNet as the main test bed during the ablation study. |
| Researcher Affiliation | Collaboration | Long Zhao¹, Zizhao Zhang², Ting Chen³, Dimitris N. Metaxas¹, Han Zhang³ (¹Rutgers University, ²Google Cloud AI, ³Google Research) |
| Pseudocode | No | The detailed algorithm can be found in the supplementary materials. This indicates that no pseudocode or algorithm blocks are present in the main paper. |
| Open Source Code | Yes | Our code is made publicly available at https://github.com/google-research/hit-gan. |
| Open Datasets | Yes | We validate the proposed method on three datasets: ImageNet [49], CelebA-HQ [25], and FFHQ [28]. |
| Dataset Splits | No | The paper mentions random crops for training and center crops for testing, and specifies the number of images used for evaluation, but does not explicitly describe a validation split for model training. |
| Hardware Specification | Yes | All the models are trained using TPU for one million iterations on Image Net and 500,000 iterations on FFHQ and Celeb A-HQ. The throughput is measured on a single Tesla V100 GPU. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Our model is trained with a standard non-saturating logistic GAN loss with R1 gradient penalty [40] applied to the discriminator. The R1 penalty penalizes the discriminator for deviating from the Nash equilibrium by penalizing the gradient on real data alone. The gradient penalty weight is set to 10. Adam [29] is utilized for optimization with β1 = 0 and β2 = 0.99. The learning rate is 0.0001 for both the generator and discriminator. All the models are trained using TPU for one million iterations on ImageNet and 500,000 iterations on FFHQ and CelebA-HQ. We set the mini-batch size to 256 for image resolutions of 128×128 and 256×256, and to 32 for the resolution of 1024×1024. (A hedged code sketch of this loss and optimizer setup follows the table.) |
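
For readers who want to see what the quoted training objective looks like in code, below is a minimal PyTorch sketch, not the authors' implementation (their released code is TensorFlow-based): it combines the non-saturating logistic loss with an R1 gradient penalty (weight 10, as quoted above) and the stated Adam settings (β1 = 0, β2 = 0.99, learning rate 1e-4). All names here (`d_loss_with_r1`, `g_loss`, the toy `generator`/`discriminator` modules) are hypothetical placeholders, and the γ/2 scaling on the penalty follows the common StyleGAN2 convention, which may differ from how the paper folds in the weight of 10.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny placeholder networks; the real HiT generator/discriminator are far larger.
discriminator = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1))
generator = nn.Sequential(nn.Linear(128, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))

def d_loss_with_r1(disc, real_images, fake_images, gamma=10.0):
    # Non-saturating logistic discriminator loss:
    #   softplus(-D(x_real)) + softplus(D(x_fake))
    real_images = real_images.detach().requires_grad_(True)
    real_logits = disc(real_images)
    fake_logits = disc(fake_images.detach())
    loss = F.softplus(-real_logits).mean() + F.softplus(fake_logits).mean()

    # R1 penalty: (gamma / 2) * E[ ||grad_x D(x)||^2 ], on real data only.
    (grad_real,) = torch.autograd.grad(
        outputs=real_logits.sum(), inputs=real_images, create_graph=True)
    r1 = grad_real.pow(2).reshape(grad_real.size(0), -1).sum(1).mean()
    return loss + (gamma / 2.0) * r1

def g_loss(disc, fake_images):
    # Non-saturating generator loss: softplus(-D(G(z)))
    return F.softplus(-disc(fake_images)).mean()

# Adam settings quoted in the row above: beta1 = 0, beta2 = 0.99, lr = 1e-4.
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.0, 0.99))
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.0, 0.99))

# One illustrative training step on random stand-in data.
real = torch.randn(4, 3, 64, 64)
z = torch.randn(4, 128)

d_opt.zero_grad()
d_loss_with_r1(discriminator, real, generator(z)).backward()
d_opt.step()

g_opt.zero_grad()
g_loss(discriminator, generator(z)).backward()
g_opt.step()
```

Computing the R1 term from `real_logits.sum()` with `create_graph=True` is the standard trick for obtaining per-sample input gradients that remain differentiable, so the penalty itself can be backpropagated through during the discriminator update.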