Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
Authors: Jost Tobias Springenberg
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our method, which we dub categorical generative adversarial networks (or CatGAN), on synthetic data as well as on challenging image classification tasks, demonstrating the robustness of the learned classifiers. We further qualitatively assess the fidelity of samples generated by the adversarial generator that is learned alongside the discriminative classifier, and identify links between the CatGAN objective and discriminative clustering algorithms (such as RIM). |
| Researcher Affiliation | Academia | Jost Tobias Springenberg University of Freiburg 79110 Freiburg, Germany springj@cs.uni-freiburg.de |
| Pseudocode | No | The paper provides detailed mathematical formulations of its objective functions and describes the training procedure in text, but it does not include a dedicated pseudocode block or algorithm listing. |
| Open Source Code | No | The paper thanks developers of Theano and Lasagne for 'sharing research code' but does not explicitly state that the code for the methodology described in this paper is open-source or provide a link to it. |
| Open Datasets | Yes | We performed experiments using fully connected and convolutional networks on MNIST (Le Cun et al., 1989) and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | In this second step we simply looked at 100 examples from a validation set (we always kept 10000 examples from the training set for validation) for which we assume the correct labeling to be known, and assigned each pseudo-category y_k to be indicative of one of the true classes c_i ∈ {1 . . . 10}. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types). |
| Software Dependencies | No | The paper mentions software like Theano, Lasagne, Adam, and SMORMS3, often with citations to their respective papers, but it does not specify the version numbers of these software dependencies as used in their experiments. |
| Experiment Setup | Yes | More specifically, we use batch size B = 100 in all experiments and approximate the expectations in Equation (7) and Equation (9) using 100 random examples from X, X^L and the generator G(z) respectively. We then do one gradient ascent step on the objective for the discriminator followed by one gradient descent step on the objective for the generator. We also added noise to all layers as mentioned in the main paper. ... Gaussian noise added to the batch-normalized hidden activations to yield slightly better performance. ... Gaussian noise with standard deviation 0.05 ... The noise dimensionality for vectors z was chosen to be 10 ... The noise dimensionality for vectors z was chosen as 128 and the cost weighting factor λ was simply set to λ = 1. |
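The discriminator objective referenced above (Equation 7 of the paper) trades off three entropy terms over softmax class assignments: high marginal entropy and low conditional entropy on real data, high conditional entropy on generated data. A minimal NumPy sketch of those terms, with function names of our own choosing (the paper gives only the math, not code):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def conditional_entropy(p):
    """Mean per-example entropy E_x[H(p(y|x, D))].

    Low when the discriminator assigns each example confidently to one class.
    p has shape (batch, num_classes), e.g. (100, 10) for B = 100 on MNIST.
    """
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

def marginal_entropy(p):
    """Entropy of the mean class distribution H(E_x[p(y|x, D)]).

    High when the predicted classes are used evenly across the batch.
    """
    q = p.mean(axis=0)
    return float(-(q * np.log(q + 1e-12)).sum())

def discriminator_objective(p_real, p_fake):
    """Unsupervised part of the CatGAN discriminator objective (maximized):
    balanced and confident on real samples, uncertain on generated samples."""
    return (marginal_entropy(p_real)
            - conditional_entropy(p_real)
            + conditional_entropy(p_fake))
```

In training, one gradient ascent step is taken on this objective for the discriminator, then one descent step on the generator's objective (which instead minimizes the conditional entropy of its samples), matching the alternating scheme quoted in the row above.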