MarginGAN: Adversarial Training in Semi-Supervised Learning
Authors: Jinhao Dong, Tong Lin
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark datasets testify that MarginGAN is orthogonal to several state-of-the-art methods, offering improved error rates and shorter training time as well. |
| Researcher Affiliation | Academia | Jinhao Dong, School of Computer Science and Technology, Xidian University, Xi'an 710126, China, jhdong@stu.xidian.edu.cn; Tong Lin, Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing & Peng Cheng Laboratory, Shenzhen, lintong@pku.edu.cn |
| Pseudocode | No | The paper describes the architecture and loss functions of MarginGAN but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for MarginGAN is publicly available. |
| Open Datasets | Yes | MNIST consists of a training set of 60,000 images and a test set of 10,000 images... SVHN contains 73,257 digits for training and 26,032 digits for testing... The CIFAR-10 dataset consists of 50,000 training samples and 10,000 test samples... (a loading sketch that checks these sizes follows the table) |
| Dataset Splits | No | The paper specifies labeled and unlabeled samples for training and a separate test set, but it does not explicitly define or provide sizes for a distinct validation dataset split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions frameworks and models (e.g., 'InfoGAN', 'residual network', 'Shake-Shake regularization', 'mean teacher training') but does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | When training, we first pretrain the classifier using only the labeled examples until it reaches an error rate below 8.0%, 9.3%, 9.5% and 9.7%, corresponding to 100, 600, 1000 and 3000 labeled examples respectively. Then the unlabeled samples and generated samples engage in the training process. In the ablation study, because of the instability of pseudo labels and the lack of labeled examples in some cases, we decrease the learning rate from 0.1 to 0.01. We employ a 12-block residual network [23] with Shake-Shake regularization [24] as our classifier, which is the same as the ResNet version used in the mean teacher implementation. Our algorithm integrates the generator and the discriminator from InfoGAN [22] into this residual network. We also use mean teacher training for averaging model weights over recent training examples. (a sketch of the pretraining threshold and mean teacher averaging follows the table) |
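
The dataset sizes quoted in the Open Datasets row can be checked directly with torchvision. The following is a minimal sketch, not taken from the paper; the `data/` root path and the `download=True` flags are illustrative assumptions.

```python
# Minimal sketch (not from the paper): verify the dataset sizes quoted in the
# Open Datasets row using torchvision. Root path and download flags are assumed.
from torchvision import datasets

mnist_train = datasets.MNIST("data/", train=True, download=True)
mnist_test = datasets.MNIST("data/", train=False, download=True)
svhn_train = datasets.SVHN("data/", split="train", download=True)
svhn_test = datasets.SVHN("data/", split="test", download=True)
cifar_train = datasets.CIFAR10("data/", train=True, download=True)
cifar_test = datasets.CIFAR10("data/", train=False, download=True)

print(len(mnist_train), len(mnist_test))   # 60000 10000
print(len(svhn_train), len(svhn_test))     # 73257 26032
print(len(cifar_train), len(cifar_test))   # 50000 10000
```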
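The Experiment Setup row mentions two concrete ingredients: pretraining the classifier on labeled data until a per-setting error threshold is met, and mean teacher weight averaging. Below is a minimal PyTorch-style sketch of those two pieces, assuming a standard exponential-moving-average (EMA) teacher update. The function names, the `eval_error` callable, the EMA decay of 0.999, and the loop structure are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): pretrain the classifier on labeled
# examples until its error rate drops below the per-setting threshold, then
# maintain a mean teacher whose weights are an EMA of the student's weights.
import torch

# Error-rate thresholds quoted in the paper for 100/600/1000/3000 labeled examples.
PRETRAIN_ERROR_THRESHOLDS = {100: 0.080, 600: 0.093, 1000: 0.095, 3000: 0.097}

def pretrain_classifier(student, labeled_loader, eval_error, num_labels, optimizer):
    """Train on labeled examples only until the quoted error threshold is met.

    `eval_error` is an assumed callable returning the current error rate.
    """
    criterion = torch.nn.CrossEntropyLoss()
    threshold = PRETRAIN_ERROR_THRESHOLDS[num_labels]
    while eval_error(student) > threshold:
        for x, y in labeled_loader:
            optimizer.zero_grad()
            loss = criterion(student(x), y)
            loss.backward()
            optimizer.step()

@torch.no_grad()
def update_mean_teacher(teacher, student, ema_decay=0.999):
    """EMA update: teacher <- decay * teacher + (1 - decay) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(ema_decay).add_(s_param, alpha=1.0 - ema_decay)

# Usage sketch: the teacher starts as a copy of the pretrained student
# (e.g. teacher = copy.deepcopy(student)) and update_mean_teacher(teacher, student)
# is called after every optimizer step in the semi-supervised phase.
```

The EMA decay value is a common default from mean teacher implementations; the paper's summary here does not state which value was used.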