Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks

Authors: Jinhee Lee, Haeri Kim, Youngkyu Hong, Hye Won Chung

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results demonstrate that the proposed method improves GAN performance on various datasets, and it is especially effective in improving the quality and diversity of sample generation for minor groups.
Researcher Affiliation | Collaboration | Jinhee Lee, School of Electrical Engineering, KAIST (jin.lee@kaist.ac.kr); Haeri Kim, Samsung Research, Samsung Electronics (haeri.kim@samsung.com); Youngkyu Hong, NAVER AI Lab, NAVER (youngkyu.hong@navercorp.com); Hye Won Chung, School of Electrical Engineering, KAIST (hwchung@kaist.ac.kr)
Pseudocode | No | The overall algorithm (with details in Appendix D) can be summarized as below. Phase 1 (Train and Diagnose): train the GAN and evaluate the discrepancy score of each data instance. Phase 2 (Score-Based Weighted Sampling): encourage the GAN to learn underrepresented regions of the data manifold through score-based weighted sampling (Section 4.1). Phase 3 (DRS): after GAN training, correct the model distribution pg(x) by rejection sampling.
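Phase 2's score-based weighted sampling can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `scores` array and the sampling-probability-proportional-to-score rule are assumptions about how per-sample discrepancy scores would drive minibatch selection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample discrepancy scores (higher = more underrepresented).
scores = rng.random(1000)

def weighted_batch_indices(scores, batch_size):
    """Draw minibatch indices with probability proportional to each
    sample's discrepancy score, so underrepresented samples are seen
    more often during phase-2 training."""
    probs = scores / scores.sum()
    return rng.choice(len(scores), size=batch_size, replace=True, p=probs)

batch = weighted_batch_indices(scores, 64)
```

In practice such indices would feed a standard data loader, replacing uniform sampling only during phase 2.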
Open Source Code | Yes | Our code is publicly available at https://github.com/grayhong/self-diagnosing-gan.
Open Datasets | Yes | We train SNGAN [25] on CIFAR-10 [17] for 40k steps and measure the discrepancy score of each sample. We further evaluate the scalability of our method with StyleGAN2 [16] on the FFHQ 256x256 [15] dataset. To control the level of minority, we design a Colored MNIST dataset with red (major) and green (minor) samples, and an MNIST-FMNIST dataset with MNIST (major) and FMNIST (minor) samples, with majority rates ρ ∈ {90, 95, 99}%.
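The Colored MNIST construction described above can be sketched as follows. This is an assumed recipe: the red/green channel assignment and the Bernoulli draw with majority rate ρ are illustrative, not the authors' exact preprocessing.

```python
import numpy as np

rng = np.random.default_rng(0)

def colorize(images, rho=0.95):
    """Turn grayscale images (N, H, W) into RGB images (N, H, W, 3),
    coloring each image red (major group) with probability rho and
    green (minor group) otherwise."""
    n = len(images)
    is_major = rng.random(n) < rho
    colored = np.zeros((*images.shape, 3), dtype=images.dtype)
    colored[is_major, ..., 0] = images[is_major]    # red channel: major
    colored[~is_major, ..., 1] = images[~is_major]  # green channel: minor
    return colored, is_major

imgs = np.ones((10, 28, 28))
colored, is_major = colorize(imgs, rho=0.9)
```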
Dataset Splits | No | We train our model for 50k (75k) steps for CIFAR-10 (CelebA), where for our method and GOLD, phase 1 takes 40k (60k) steps and phase 2 takes the remaining steps. We record the LDR every 100 steps and use the last 50 records to calculate the discrepancy score.
Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix D.
Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers (e.g., library or framework versions such as PyTorch 1.9 or TensorFlow 2.x).
Experiment Setup | Yes | We train our model for 50k (75k) steps for CIFAR-10 (CelebA), where for our method and GOLD, phase 1 takes 40k (60k) steps and phase 2 takes the remaining steps. We record the LDR every 100 steps and use the last 50 records to calculate the discrepancy score. For the discrepancy score (7), we use k = 0.3 (5.0) for CIFAR-10 (CelebA). Detailed configurations and the hyperparameter search procedure are available in Appendix F.
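The setup above derives each sample's discrepancy score from its last 50 recorded LDR (log density ratio) values with a weight k. A minimal sketch, assuming the score combines the mean and standard deviation of the LDR history; the exact form is Eq. (7) in the paper and may differ:

```python
import numpy as np

def discrepancy_score(ldr_records, k=0.3):
    """Assumed form of the per-sample discrepancy score: the mean of the
    recorded LDR values plus k times their standard deviation, so samples
    whose LDR is high or unstable over training score higher.
    ldr_records: the last 50 LDR values recorded every 100 steps."""
    ldr = np.asarray(ldr_records, dtype=float)
    return ldr.mean() + k * ldr.std()

# Example: a sample whose LDR held steady at 1.0 over the last 50 records.
score = discrepancy_score([1.0] * 50, k=0.3)  # mean 1.0, std 0.0 -> 1.0
```

Under this assumption, the quoted k = 0.3 (5.0) simply rescales how much LDR instability contributes relative to its level on CIFAR-10 (CelebA).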