Semi-supervised Learning with GANs: Manifold Invariance with Improved Inference
Authors: Abhishek Kumar, Prasanna Sattigeri, P. Thomas Fletcher
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We observe considerable empirical gains in semi-supervised learning over baselines, particularly in the cases when the number of labeled examples is low. We also provide insights into how fake examples influence the semi-supervised learning procedure. |
| Researcher Affiliation | Collaboration | Abhishek Kumar, IBM Research AI, Yorktown Heights, NY (abhishk@us.ibm.com); Prasanna Sattigeri, IBM Research AI, Yorktown Heights, NY (psattig@us.ibm.com); P. Thomas Fletcher, University of Utah, Salt Lake City, UT (fletcher@sci.utah.edu) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements about making its source code publicly available. |
| Open Datasets | Yes | We plot several of the quantities of interest discussed above for MNIST (with 100 labeled examples) and SVHN (with 1000 labeled examples) datasets in Fig. 1. Semi-supervised learning results. Table 1 shows the results for SVHN and CIFAR10 with varying numbers of labeled examples. |
| Dataset Splits | No | The paper mentions a 'test' and 'training set' for a classifier, but does not explicitly describe validation splits or split percentages for the main model training and evaluation. (A class-balanced labeled-subset sampling sketch appears after the table.) |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions certain components like ADAM optimizer and ELU nonlinearities, but does not provide specific software package names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | Yes | Implementation Details. The architecture of the encoder, generator, and discriminator closely follows the network structures in ALI [9]. We remove the stochastic layer from the ALI encoder (i.e., h(x) is deterministic). For estimating the dominant tangents, we employ a fully connected two-layer network with tanh non-linearity in the hidden layer. The hyperparameters λ1 and λ2 in Eq. (7) are set to 1. All results for the proposed methods (last 3 rows) are obtained by training the model for 600 epochs for SVHN and 900 epochs for CIFAR10, and are averaged over 5 runs. We follow [34] completely for optimization (using the ADAM optimizer [15] with the same learning rates as in [34]). A minimal sketch of this setup appears after the table. |
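
The low-label regimes quoted above (100 labels for MNIST, 1000 for SVHN, and several settings for CIFAR10) imply a labeled/unlabeled split of the training set that the paper does not spell out. Below is a minimal, hypothetical sketch of how such a subset is commonly drawn for semi-supervised benchmarks: class-balanced, with a fixed seed. The function name, seed, and balancing scheme are assumptions, not the authors' documented procedure.

```python
# Hypothetical sketch: class-balanced sampling of a small labeled subset,
# as commonly done for semi-supervised benchmarks (e.g., 100 labels for
# MNIST, 1000 for SVHN). The paper does not specify its exact procedure.
import numpy as np

def sample_labeled_subset(labels, n_labeled, n_classes=10, seed=0):
    """Pick n_labeled indices, balanced across classes; the rest are unlabeled."""
    rng = np.random.RandomState(seed)
    per_class = n_labeled // n_classes
    labeled_idx = []
    for c in range(n_classes):
        idx = np.where(labels == c)[0]
        labeled_idx.extend(rng.choice(idx, per_class, replace=False))
    labeled_idx = np.array(labeled_idx)
    # Everything not selected is treated as unlabeled data for the GAN.
    unlabeled_idx = np.setdiff1d(np.arange(len(labels)), labeled_idx)
    return labeled_idx, unlabeled_idx

# Example: 100 labeled MNIST examples (10 per class), rest unlabeled.
# labeled, unlabeled = sample_labeled_subset(train_labels, n_labeled=100)
```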
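
For the experiment-setup row, a minimal PyTorch sketch of the reported pieces may help: a fully connected two-layer network with a tanh hidden layer (the form the authors report for estimating dominant tangents), Adam optimization, and the Eq. (7) weights λ1 = λ2 = 1. Layer sizes, the learning rate, and the placeholder loss terms are assumptions; only the architecture shape, optimizer choice, and λ values come from the paper.

```python
# Minimal sketch of the reported setup, with assumptions flagged inline.
import torch
import torch.nn as nn

class TangentNet(nn.Module):
    """Fully connected two-layer MLP with tanh non-linearity in the hidden layer."""
    def __init__(self, in_dim=128, hidden_dim=256, out_dim=128):  # sizes are assumptions
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, z):
        return self.net(z)

model = TangentNet()
lambda1, lambda2 = 1.0, 1.0  # as stated for Eq. (7) in the paper
opt = torch.optim.Adam(model.parameters(), lr=2e-4)  # lr is an assumption

# One illustrative step with placeholder loss terms standing in for the two
# regularizers weighted by lambda1 and lambda2 in Eq. (7); the paper's actual
# objective is not reproduced in this excerpt.
z = torch.randn(64, 128)
out = model(z)
loss = lambda1 * out.pow(2).mean() + lambda2 * out.abs().mean()
opt.zero_grad()
loss.backward()
opt.step()
```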