reproducibilityindex.ai

Good Semi-supervised Learning That Requires a Bad GAN

Authors: Zihang Dai, Zhilin Yang, Fan Yang, William W. Cohen, Russ R. Salakhutdinov

NeurIPS 2017

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, we derive a novel formulation based on our analysis that substantially improves over feature matching GANs, obtaining state-of-the-art results on multiple benchmark datasets. ... Empirically, our approach substantially improves over vanilla feature matching GANs, and obtains new state-of-the-art results on MNIST, SVHN, and CIFAR-10 ... 6 Experiments We mainly consider three widely used benchmark datasets, namely MNIST, SVHN, and CIFAR-10. ... Table 1: Comparison with state-of-the-art methods on three benchmark datasets. ... Table 2: Ablation study.
Researcher Affiliation	Academia	Zihang Dai, Zhilin Yang, Fan Yang, William W. Cohen, Ruslan Salakhutdinov, School of Computer Science, Carnegie Mellon University, {dzihang,zhiliny,fanyang1,wcohen,rsalakhu}@cs.cmu.edu
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code	Yes	Code is available at https://github.com/kimiyoung/ssl_bad_gan.
Open Datasets	Yes	We mainly consider three widely used benchmark datasets, namely MNIST, SVHN, and CIFAR-10.
Dataset Splits	Yes	As in previous work, we randomly sample 100, 1,000, and 4,000 labeled samples for MNIST, SVHN, and CIFAR-10 respectively during training, and use the standard data split for testing.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers (e.g., library names with versions).
Experiment Setup	Yes	We add instance noise to the input of the discriminator [1, 18], and use spatial dropout [20] to obtain faster convergence. Except for these two modifications, we use the same neural network architecture as in [16]. We use the 10-quantile log probability to define the threshold ϵ in Eq. (4).
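
The Dataset Splits and Experiment Setup rows describe two steps that are easy to misread, so a minimal Python sketch may help. It is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that). It assumes that "randomly sample" means uniform sampling without replacement (the quoted text does not say whether the draw is class-balanced), and that "10-quantile log probability" means the 10th percentile of log-density values computed over training examples. The function names, the `seed` argument, and the `LABELED_COUNTS` dictionary are hypothetical.

```python
import numpy as np

# Labeled-set sizes quoted in the Dataset Splits row.
LABELED_COUNTS = {"MNIST": 100, "SVHN": 1000, "CIFAR-10": 4000}


def sample_labeled_indices(num_examples, num_labeled, seed=0):
    """Pick `num_labeled` distinct training indices uniformly at random.

    Assumption: plain uniform sampling; a class-balanced draw is a
    common alternative that the quoted text neither confirms nor rules out.
    """
    rng = np.random.default_rng(seed)
    return rng.choice(num_examples, size=num_labeled, replace=False)


def quantile_threshold(log_probs, q=0.10):
    """Return the q-quantile of a 1-D collection of log-probabilities.

    Assumption: this value plays the role of the threshold epsilon;
    how the log-probabilities are obtained is outside this sketch.
    """
    return float(np.quantile(np.asarray(log_probs, dtype=float), q))
```

For example, `sample_labeled_indices(n, LABELED_COUNTS["SVHN"])` would select 1,000 labeled indices from a training set of hypothetical size `n`, and `quantile_threshold(log_probs)` would return the value below which roughly 10% of the supplied log-probabilities fall.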