Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks
Authors: Jinhee Lee, Haeri Kim, Youngkyu Hong, Hye Won Chung
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that the proposed method improves GAN performance on various datasets, and it is especially effective in improving the quality and diversity of sample generation for minor groups. |
| Researcher Affiliation | Collaboration | Jinhee Lee, School of Electrical Engineering, KAIST (jin.lee@kaist.ac.kr); Haeri Kim, Samsung Research, Samsung Electronics (haeri.kim@samsung.com); Youngkyu Hong, NAVER AI Lab, NAVER (youngkyu.hong@navercorp.com); Hye Won Chung, School of Electrical Engineering, KAIST (hwchung@kaist.ac.kr) |
| Pseudocode | No | The overall algorithm (with details in Appendix D) can be summarized as below. Phase 1, Train and Diagnose: train the GAN and evaluate the discrepancy score for each data instance. Phase 2, Score-Based Weighted Sampling: encourage the GAN to learn underrepresented regions of the data manifold through score-based weighted sampling (Section 4.1). Phase 3, DRS: after GAN training, correct the model distribution pg(x) by rejection sampling. (Hedged sketches of the diagnose/resample loop and of DRS appear after this table.) |
| Open Source Code | Yes | Our code is publicly available at https://github.com/grayhong/self-diagnosing-gan. |
| Open Datasets | Yes | We train SNGAN [25] on CIFAR-10 [17] for 40k steps and measure the discrepancy score of each sample. We further evaluate the scalability of our method with StyleGAN2 [16] on the FFHQ 256x256 [15] dataset. To control the level of minority, we design a Colored MNIST dataset with red (major) and green (minor) samples, and an MNIST-FMNIST dataset with MNIST (major) and FMNIST (minor) samples, with majority rates ρ ∈ {90, 95, 99}%. (A sketch of such a colored construction appears after this table.) |
| Dataset Splits | No | We train our model for 50k (75k) steps for CIFAR-10 (CelebA), where for our method and GOLD, phase 1 takes 40k (60k) steps and phase 2 takes the remaining steps. We record the LDR every 100 steps and use the last 50 records for calculating the discrepancy score. |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix D. |
| Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers (e.g., library or framework versions like PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | We train our model for 50k (75k) steps for CIFAR-10 (CelebA), where for our method and GOLD, phase 1 takes 40k (60k) steps and phase 2 takes the remaining steps. We record the LDR every 100 steps and use the last 50 records for calculating the discrepancy score. For the discrepancy score (7), we use k = 0.3 (5.0) for CIFAR-10 (CelebA). Detailed configurations and the hyperparameter search procedure are available in Appendix F. (A hedged sketch of this score computation appears below.) |
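To make Phases 1 and 2 concrete, below is a minimal PyTorch-style sketch of the diagnose-and-resample loop described in the pseudocode row. It assumes the discriminator logit is used as the per-sample log density ratio (LDR) estimate, recorded every 100 steps with the last 50 records kept, as quoted above. The combination `mean + k * std` is an illustrative assumption; the exact form of the discrepancy score is given in Eq. (7) of the paper, and the function and variable names here are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# Phase 1: during GAN training, record the discriminator logit (an estimate of
# the log density ratio, LDR) for every training sample once per 100 steps.
# `ldr_records` has shape (num_records, num_samples).
def discrepancy_scores(ldr_records: torch.Tensor, k: float = 0.3) -> torch.Tensor:
    last = ldr_records[-50:]        # keep the last 50 records, per the quoted setup
    mean = last.mean(dim=0)         # per-sample LDR mean over recorded steps
    std = last.std(dim=0)           # per-sample LDR variability over recorded steps
    # ASSUMED combination: samples whose LDR is high and unstable are treated
    # as underrepresented. The paper's Eq. (7) defines the actual score.
    return mean + k * std

# Phase 2: resample the training set in proportion to the scores so the
# generator sees underrepresented samples more often.
def weighted_loader(dataset, scores: torch.Tensor, batch_size: int = 64) -> DataLoader:
    weights = scores - scores.min() + 1e-8   # shift scores to positive sampling weights
    sampler = WeightedRandomSampler(weights.tolist(), num_samples=len(dataset))
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```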
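Phase 3 applies discriminator rejection sampling (DRS) after training. The sketch below follows the standard DRS recipe of Azadi et al. (2019), which the paper builds on: accept each generated sample with probability sigmoid(F(x)), where F is a shifted acceptance logit normalized by an estimated maximum. The burn-in size and batch size are illustrative assumptions, and the γ percentile shift used in practice is omitted (γ = 0) for brevity.

```python
import torch

@torch.no_grad()
def drs_sample(generator, discriminator, latent_dim, n_keep,
               burn_in=10000, eps=1e-6):
    """Discriminator rejection sampling: keep generated samples with
    probability sigmoid(F(x)), with F the shifted acceptance logit."""
    # Burn-in: estimate the maximum discriminator logit over fake samples.
    z = torch.randn(burn_in, latent_dim)
    logit_max = discriminator(generator(z)).view(-1).max()

    kept, n_kept = [], 0
    while n_kept < n_keep:
        x = generator(torch.randn(1024, latent_dim))
        d = discriminator(x).view(-1)
        logit_max = torch.maximum(logit_max, d.max())  # update the running max
        # Shifted acceptance logit (gamma = 0 here for simplicity).
        f = d - logit_max - torch.log1p(-torch.exp(d - logit_max - eps))
        accept = torch.rand_like(f) < torch.sigmoid(f)
        kept.append(x[accept])
        n_kept += int(accept.sum())
    return torch.cat(kept)[:n_keep]
```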
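Finally, the controlled-minority datasets quoted in the Open Datasets row can be reproduced with a construction like the one below: each grayscale digit is placed in the red channel with probability ρ (major group) or the green channel otherwise (minor group). The per-sample random channel assignment is an assumption; the paper's appendix specifies the authors' exact construction.

```python
import torch
from torch.utils.data import TensorDataset
from torchvision import datasets

def colored_mnist(root="./data", rho=0.95):
    """Colored MNIST-style dataset: red digits form the major group with
    probability rho, green digits the minor group (assumed construction)."""
    base = datasets.MNIST(root, train=True, download=True)
    imgs = base.data.float() / 255.0              # (N, 28, 28) grayscale in [0, 1]
    rgb = torch.zeros(len(base), 3, 28, 28)
    is_major = torch.rand(len(base)) < rho        # red with probability rho
    rgb[is_major, 0] = imgs[is_major]             # major group: red channel
    rgb[~is_major, 1] = imgs[~is_major]           # minor group: green channel
    return TensorDataset(rgb, base.targets)
```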