Semi-supervised Learning with GANs: Manifold Invariance with Improved Inference

Authors: Abhishek Kumar, Prasanna Sattigeri, P. Thomas Fletcher

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We observe considerable empirical gains in semi-supervised learning over baselines, particularly in the cases when the number of labeled examples is low. We also provide insights into how fake examples influence the semi-supervised learning procedure."
Researcher Affiliation | Collaboration | Abhishek Kumar, IBM Research AI, Yorktown Heights, NY (abhishk@us.ibm.com); Prasanna Sattigeri, IBM Research AI, Yorktown Heights, NY (psattig@us.ibm.com); P. Thomas Fletcher, University of Utah, Salt Lake City, UT (fletcher@sci.utah.edu)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not contain any statement about making its source code publicly available.
Open Datasets | Yes | "We plot several of the quantities of interest discussed above for MNIST (with 100 labeled examples) and SVHN (with 1000 labeled examples) datasets in Fig. 1." "Semi-supervised learning results. Table 1 shows the results for SVHN and CIFAR10 with various number of labeled examples."
Dataset Splits | No | The paper mentions a 'test' set and 'training set' for a classifier, but does not explicitly describe validation splits or split percentages for the main model's training and evaluation.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions components such as the ADAM optimizer and ELU nonlinearities, but does not provide specific software package names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x).
Experiment Setup | Yes | "Implementation Details. The architecture of the encoder, generator and discriminator closely follows the network structures in ALI [9]. We remove the stochastic layer from the ALI encoder (i.e., h(x) is deterministic). For estimating the dominant tangents, we employ a fully connected two-layer network with a tanh non-linearity in the hidden layer. The hyperparameters λ1 and λ2 in Eq. (7) are set to 1. All results for the proposed methods (last 3 rows) are obtained by training the model for 600 epochs for SVHN and 900 epochs for CIFAR10, and are averaged over 5 runs. We follow [34] completely for optimization (using the ADAM optimizer [15] with the same learning rates as in [34])."
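Since the paper releases no code, the setup quoted above can only be approximated. The following is a minimal PyTorch sketch of the tangent-estimation network from the implementation details: the latent dimensionality, layer widths, number of tangent directions, and the Adam learning rate are all assumptions for illustration, not values confirmed by the paper.

import torch
import torch.nn as nn

# All sizes below are illustrative assumptions; the paper does not report them.
LATENT_DIM = 256   # assumed dimensionality of the encoder output h(x)
HIDDEN_DIM = 512   # assumed hidden width of the two-layer tangent network
N_TANGENTS = 10    # assumed number of dominant tangent directions

class TangentNet(nn.Module):
    """Fully connected two-layer network with a tanh hidden layer,
    per the paper's implementation details (sizes are assumptions)."""

    def __init__(self, in_dim=LATENT_DIM, hidden=HIDDEN_DIM,
                 n_tangents=N_TANGENTS):
        super().__init__()
        self.n_tangents = n_tangents
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.Tanh(),  # tanh non-linearity in the hidden layer
            nn.Linear(hidden, n_tangents * in_dim),
        )

    def forward(self, h):
        # One set of tangent directions per input: (batch, n_tangents, in_dim).
        return self.net(h).view(h.size(0), self.n_tangents, -1)

model = TangentNet()
# The paper optimizes with Adam at the learning rates of [34];
# lr=3e-4 is a placeholder here, not the confirmed value.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Regularization weights from Eq. (7), set to 1 in the paper.
lambda1, lambda2 = 1.0, 1.0

# Smoke test with a random batch of encoder outputs.
h = torch.randn(8, LATENT_DIM)
tangents = model(h)
assert tangents.shape == (8, N_TANGENTS, LATENT_DIM)

The sketch represents tangent directions in the encoder's latent space for compactness; the quoted text does not specify whether the tangents live in data space or latent space, so this is one plausible reading rather than the paper's confirmed design.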