Balanced Training for Sparse GANs
Authors: Yite Wang, Jing Wu, Naira Hovakimyan, Ruoyu Sun
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed method shows promising results on multiple datasets, demonstrating its effectiveness. Table 1: FID (↓) of different sparse training methods with no constraint on the density of the discriminator. |
| Researcher Affiliation | Academia | ¹University of Illinois Urbana-Champaign, USA; ²School of Data Science, The Chinese University of Hong Kong, Shenzhen, China; ³Shenzhen International Center for Industrial and Applied Mathematics, Shenzhen Research Institute of Big Data. {yitew2,jingwu6,nhovakim}@illinois.edu, sunruoyu@cuhk.edu.cn |
| Pseudocode | Yes | Algorithm 1 Dynamic density adjust (DDA) for the discriminator. |
| Open Source Code | Yes | Our code is available at https://github.com/YiteWang/ADAPT. |
| Open Datasets | Yes | We conduct experiments on SNGAN with ResNet architectures on the CIFAR-10 [40] and the STL-10 [11] datasets. We have also conducted experiments with BigGAN [7] on the CIFAR-10 and Tiny ImageNet dataset (with DiffAug [93]). |
| Dataset Splits | Yes | We use the training set of CIFAR-10, the unlabeled partition of STL-10, and the training set of Tiny ImageNet for GAN training... For the CIFAR-10 dataset, we report both FID for the training set and test set, whereas, for the STL-10 dataset, we report the FID of the unlabeled partition. |
| Hardware Specification | No | The paper mentions utilizing resources supported by a grant and refers to 'Hal: Computer system for scalable deep learning' [39], but it does not explicitly list specific hardware components such as GPU models, CPU models, or memory within its own text. |
| Software Dependencies | No | The paper mentions using Adam optimizer, hinge loss, and exponential moving average (EMA) but does not provide specific version numbers for software libraries or dependencies, such as Python version, PyTorch version, or CUDA version. |
| Experiment Setup | Yes | We use a learning rate of 2×10⁻⁴ for both generators and discriminators. The discriminator is updated five times for every generator update. We adopt the Adam optimizer with β₁ = 0 and β₂ = 0.9. The batch size of the discriminator and the generator is set to 64 and 128, respectively. Hinge loss is used following [7, 9]. We use exponential moving average (EMA) [89] with β = 0.999. The generator is trained for a total of 100k iterations. (A hedged code sketch of this setup follows the table.) |
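
To make the Experiment Setup row concrete, here is a minimal PyTorch sketch of the reported training loop, not the authors' implementation. `Generator`, `Discriminator`, `loader_d`, and `z_dim` are hypothetical placeholders; only the hyperparameters quoted above (learning rate 2×10⁻⁴, Adam with β₁ = 0 and β₂ = 0.9, five discriminator updates per generator update, hinge loss, EMA with β = 0.999, batch sizes 64/128, 100k generator iterations) come from the paper.

```python
# Hedged sketch of the reported training setup (placeholders marked below).
import copy
import torch
import torch.nn.functional as F

G = Generator()            # placeholder: SNGAN/BigGAN-style generator
D = Discriminator()        # placeholder: corresponding discriminator
G_ema = copy.deepcopy(G)   # EMA copy of the generator weights

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.0, 0.9))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.0, 0.9))

def d_hinge_loss(real_logits, fake_logits):
    # Hinge loss for the discriminator.
    return F.relu(1.0 - real_logits).mean() + F.relu(1.0 + fake_logits).mean()

def g_hinge_loss(fake_logits):
    # Hinge loss for the generator.
    return -fake_logits.mean()

EMA_BETA = 0.999
for step in range(100_000):                  # 100k generator iterations
    for _ in range(5):                       # 5 D updates per G update
        real = next(loader_d)                # placeholder loader, batch size 64
        z = torch.randn(64, G.z_dim)
        loss_d = d_hinge_loss(D(real), D(G(z).detach()))
        opt_D.zero_grad(); loss_d.backward(); opt_D.step()

    z = torch.randn(128, G.z_dim)            # generator batch size 128
    loss_g = g_hinge_loss(D(G(z)))
    opt_G.zero_grad(); loss_g.backward(); opt_G.step()

    with torch.no_grad():                    # EMA of generator weights, beta = 0.999
        for p_ema, p in zip(G_ema.parameters(), G.parameters()):
            p_ema.mul_(EMA_BETA).add_(p, alpha=1.0 - EMA_BETA)
```

The sketch omits the paper's sparse-training components (e.g., the dynamic density adjustment for the discriminator named in Algorithm 1), since the table does not specify their details; it only illustrates the optimizer, loss, update ratio, and EMA settings quoted above.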