Balanced Training for Sparse GANs

Authors: Yite Wang, Jing Wu, Naira Hovakimyan, Ruoyu Sun

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our proposed method shows promising results on multiple datasets, demonstrating its effectiveness. Table 1: FID (↓) of different sparse training methods with no constraint on the density of the discriminator.
Researcher Affiliation | Academia | 1 University of Illinois Urbana-Champaign, USA; 2 School of Data Science, The Chinese University of Hong Kong, Shenzhen, China; 3 Shenzhen International Center for Industrial and Applied Mathematics, Shenzhen Research Institute of Big Data. {yitew2,jingwu6,nhovakim}@illinois.edu, sunruoyu@cuhk.edu.cn
Pseudocode | Yes | Algorithm 1: Dynamic density adjust (DDA) for the discriminator. (An illustrative sketch follows the table.)
Open Source Code | Yes | Our code is available at https://github.com/YiteWang/ADAPT.
Open Datasets | Yes | We conduct experiments on SNGAN with ResNet architectures on the CIFAR-10 [40] and STL-10 [11] datasets. We have also conducted experiments with BigGAN [7] on the CIFAR-10 and Tiny ImageNet datasets (with DiffAug [93]).
Dataset Splits | Yes | We use the training set of CIFAR-10, the unlabeled partition of STL-10, and the training set of Tiny ImageNet for GAN training... For the CIFAR-10 dataset, we report the FID for both the training set and the test set, whereas for the STL-10 dataset, we report the FID of the unlabeled partition. (A loading sketch follows the table.)
Hardware Specification | No | The paper mentions utilizing resources supported by a grant and cites 'HAL: Computer System for Scalable Deep Learning' [39], but it does not explicitly list specific hardware components such as GPU models, CPU models, or memory within its own text.
Software Dependencies | No | The paper mentions using the Adam optimizer, hinge loss, and exponential moving average (EMA), but it does not provide version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We use a learning rate of 2 × 10⁻⁴ for both generators and discriminators. The discriminator is updated five times for every generator update. We adopt the Adam optimizer with β1 = 0 and β2 = 0.9. The batch sizes of the discriminator and the generator are set to 64 and 128, respectively. Hinge loss is used following [7, 9]. We use exponential moving average (EMA) [89] with β = 0.999. The generator is trained for a total of 100k iterations. (A configuration sketch follows the table.)
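
The Pseudocode row above refers to Algorithm 1, dynamic density adjust (DDA) for the discriminator, but the excerpt does not reproduce the algorithm itself. The following is only an illustrative sketch of the general idea of nudging a sparse discriminator's density up or down in response to a generator/discriminator balance signal; the function, the update rule, the thresholds, and every parameter name (balance, step, d_min, d_max) are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: adjust a sparse discriminator's density in
# response to a balance signal between generator and discriminator.
# All names, thresholds, and the update rule are hypothetical; see
# Algorithm 1 (DDA) in the paper for the actual procedure.
def dynamic_density_adjust(density, balance, lower=0.4, upper=0.6,
                           step=0.05, d_min=0.1, d_max=1.0):
    """Return an updated discriminator density.

    density : current fraction of active discriminator weights.
    balance : scalar in [0, 1]; larger values mean the discriminator is
              dominating the generator (a hypothetical stand-in for the
              paper's balance measure).
    """
    if balance > upper:      # discriminator too strong: sparsify it further
        density -= step
    elif balance < lower:    # discriminator too weak: let it grow denser
        density += step
    return min(max(density, d_min), d_max)
```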
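
The Open Datasets and Dataset Splits rows name the exact partitions used for GAN training and FID reporting. Below is a minimal loading sketch assuming torchvision is used; the root paths and transform are placeholders, and Tiny ImageNet, which torchvision does not ship, is assumed to be extracted locally in an ImageFolder layout.

```python
# Minimal sketch of the splits named above (torchvision assumed; the
# paths and transform are placeholders, not taken from the paper).
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# CIFAR-10: training set for GAN training; FID is reported on both splits.
cifar_train = datasets.CIFAR10("./data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("./data", train=False, download=True, transform=to_tensor)

# STL-10: the unlabeled partition is used for training and FID reporting.
stl_unlabeled = datasets.STL10("./data", split="unlabeled", download=True, transform=to_tensor)

# Tiny ImageNet is not bundled with torchvision; assume it was extracted
# to a local directory in ImageFolder layout.
tiny_train = datasets.ImageFolder("./data/tiny-imagenet-200/train", transform=to_tensor)
```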
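
The Experiment Setup row fully specifies the optimizer, batch, loss, and EMA configuration, which maps directly onto a short PyTorch-style sketch. Only the hyperparameters below come from the quoted text; the tiny linear modules standing in for the generator and discriminator, and the helper names, are placeholders.

```python
# Minimal sketch of the quoted training configuration (PyTorch assumed).
# The tiny nets below are placeholders; only the hyperparameters come
# from the Experiment Setup row above.
import copy
import torch
import torch.nn.functional as F

LR, BETAS = 2e-4, (0.0, 0.9)   # Adam with beta1 = 0, beta2 = 0.9
N_DIS = 5                      # five discriminator updates per generator update
BATCH_D, BATCH_G = 64, 128     # discriminator / generator batch sizes
TOTAL_G_ITERS = 100_000        # generator trained for 100k iterations
EMA_BETA = 0.999               # EMA decay for the generator weights

netG = torch.nn.Linear(128, 32 * 32 * 3)   # placeholder generator
netD = torch.nn.Linear(32 * 32 * 3, 1)     # placeholder discriminator
netG_ema = copy.deepcopy(netG)             # EMA copy of the generator

opt_g = torch.optim.Adam(netG.parameters(), lr=LR, betas=BETAS)
opt_d = torch.optim.Adam(netD.parameters(), lr=LR, betas=BETAS)

def d_hinge_loss(real_logits, fake_logits):
    """Hinge loss for the discriminator, following [7, 9]."""
    return F.relu(1.0 - real_logits).mean() + F.relu(1.0 + fake_logits).mean()

def g_hinge_loss(fake_logits):
    """Hinge loss for the generator."""
    return -fake_logits.mean()

@torch.no_grad()
def ema_update(ema_model, model, beta=EMA_BETA):
    """Exponential moving average of the generator parameters."""
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(beta).add_(p, alpha=1.0 - beta)
```

Per the quoted schedule, a training loop would run N_DIS discriminator steps for every generator step and call ema_update(netG_ema, netG) after each generator update, stopping once TOTAL_G_ITERS generator iterations have been reached.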