Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks

Authors: Jinhee Lee, Haeri Kim, Youngkyu Hong, Hye Won Chung

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results demonstrate that the proposed method improves GAN performance on various datasets, and it is especially effective in improving the quality and diversity of sample generation for minor groups.
Researcher Affiliation | Collaboration | Jinhee Lee, School of Electrical Engineering, KAIST (jin.lee@kaist.ac.kr); Haeri Kim, Samsung Research, Samsung Electronics (haeri.kim@samsung.com); Youngkyu Hong, NAVER AI Lab, NAVER (youngkyu.hong@navercorp.com); Hye Won Chung, School of Electrical Engineering, KAIST (hwchung@kaist.ac.kr)
Pseudocode | No | The overall algorithm (with details in Appendix D) can be summarized as below. Phase 1 (Train and Diagnose): train the GAN and evaluate the discrepancy score of each data instance. Phase 2 (Score-Based Weighted Sampling): encourage the GAN to learn underrepresented regions of the data manifold through score-based weighted sampling (Section 4.1). Phase 3 (DRS): after GAN training, correct the model distribution pg(x) by rejection sampling.
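Phase 2's score-based weighted sampling can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `scores` array and the sampling-probability-proportional-to-score rule are assumptions about how per-sample discrepancy scores would drive minibatch selection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample discrepancy scores (higher = more underrepresented).
scores = rng.random(1000)

def weighted_batch_indices(scores, batch_size):
    """Draw minibatch indices with probability proportional to each
    sample's discrepancy score, so underrepresented samples are seen
    more often during phase-2 training."""
    probs = scores / scores.sum()
    return rng.choice(len(scores), size=batch_size, replace=True, p=probs)

batch = weighted_batch_indices(scores, 64)
```

In practice such indices would feed a standard data loader, replacing uniform sampling only during phase 2.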
Open Source Code | Yes | Our code is publicly available at https://github.com/grayhong/self-diagnosing-gan.
Open Datasets | Yes | We train SNGAN [25] on CIFAR-10 [17] for 40k steps and measure the discrepancy score of each sample. We further evaluate the scalability of our method with StyleGAN2 [16] on the FFHQ 256x256 [15] dataset. To control the level of minority, we design a Colored MNIST dataset with red (major) and green (minor) samples, and an MNIST-FMNIST dataset with MNIST (major) and FMNIST (minor) samples, with majority rates ρ ∈ {90, 95, 99}%.
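The Colored MNIST construction described above can be sketched as follows. This is an assumed recipe: the red/green channel assignment and the Bernoulli draw with majority rate ρ are illustrative, not the authors' exact preprocessing.

```python
import numpy as np

rng = np.random.default_rng(0)

def colorize(images, rho=0.95):
    """Turn grayscale images (N, H, W) into RGB images (N, H, W, 3),
    coloring each image red (major group) with probability rho and
    green (minor group) otherwise."""
    n = len(images)
    is_major = rng.random(n) < rho
    colored = np.zeros((*images.shape, 3), dtype=images.dtype)
    colored[is_major, ..., 0] = images[is_major]    # red channel: major
    colored[~is_major, ..., 1] = images[~is_major]  # green channel: minor
    return colored, is_major

imgs = np.ones((10, 28, 28))
colored, is_major = colorize(imgs, rho=0.9)
```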
Dataset Splits | No | We train our model for 50k (75k) steps for CIFAR-10 (CelebA), where for our method and GOLD, phase 1 takes 40k (60k) steps and phase 2 takes the remaining steps. We record the LDR every 100 steps and use the last 50 records to calculate the discrepancy score.
Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix D.
Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers (e.g., library or framework versions such as PyTorch 1.9 or TensorFlow 2.x).
Experiment Setup | Yes | We train our model for 50k (75k) steps for CIFAR-10 (CelebA), where for our method and GOLD, phase 1 takes 40k (60k) steps and phase 2 takes the remaining steps. We record the LDR every 100 steps and use the last 50 records to calculate the discrepancy score. For the discrepancy score (7), we use k = 0.3 (5.0) for CIFAR-10 (CelebA). Detailed configurations and the hyperparameter search procedure are available in Appendix F.
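The setup above derives each sample's discrepancy score from its last 50 recorded LDR (log density ratio) values with a weight k. A minimal sketch, assuming the score combines the mean and standard deviation of the LDR history; the exact form is Eq. (7) in the paper and may differ:

```python
import numpy as np

def discrepancy_score(ldr_records, k=0.3):
    """Assumed form of the per-sample discrepancy score: the mean of the
    recorded LDR values plus k times their standard deviation, so samples
    whose LDR is high or unstable over training score higher.
    ldr_records: the last 50 LDR values recorded every 100 steps."""
    ldr = np.asarray(ldr_records, dtype=float)
    return ldr.mean() + k * ldr.std()

# Example: a sample whose LDR held steady at 1.0 over the last 50 records.
score = discrepancy_score([1.0] * 50, k=0.3)  # mean 1.0, std 0.0 -> 1.0
```

Under this assumption, the quoted k = 0.3 (5.0) simply rescales how much LDR instability contributes relative to its level on CIFAR-10 (CelebA).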