Adversarial Feature Learning

Authors: Jeff Donahue, Philipp Krähenbühl, Trevor Darrell

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the feature learning capabilities of BiGANs by first training them unsupervised as described in Section 3.4, then transferring the encoder's learned feature representations for use in auxiliary supervised learning tasks. To demonstrate that BiGANs are able to learn meaningful feature representations both on arbitrary data vectors, where the model is agnostic to any underlying structure, as well as very high-dimensional and complex distributions, we evaluate on both permutation-invariant MNIST (LeCun et al., 1998) and on the high-resolution natural images of ImageNet (Russakovsky et al., 2015). (A minimal sketch of the BiGAN training objective follows this table.)
Researcher Affiliation | Academia | Jeff Donahue (jdonahue@cs.berkeley.edu), Computer Science Division, University of California, Berkeley; Philipp Krähenbühl (philkr@utexas.edu), Department of Computer Science, University of Texas, Austin; Trevor Darrell (trevor@eecs.berkeley.edu), Computer Science Division, University of California, Berkeley
Pseudocode | No | The paper describes methods in text but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not provide a direct statement of code release or a link to a code repository for their BiGAN implementation.
Open Datasets | Yes | We evaluate on both permutation-invariant MNIST (LeCun et al., 1998) and on the high-resolution natural images of ImageNet (Russakovsky et al., 2015).
Dataset Splits | No | The paper mentions evaluating on the ImageNet LSVRC (Russakovsky et al., 2015) validation set, but beyond implicitly using the standard splits of well-known datasets it does not give the percentages, sample counts, or methodology needed to reproduce the training, validation, and test splits.
Hardware Specification | Yes | Most computation is performed on an NVIDIA Titan X or Tesla K40 GPU.
Software Dependencies | No | The paper mentions Theano (Theano Development Team, 2016) and Caffe (Jia et al., 2014) but does not provide specific version numbers for these or any other software dependencies beyond their publication years.
Experiment Setup | Yes | For unsupervised training of BiGANs and baseline methods, we use the Adam optimizer (Kingma & Ba, 2015) to compute parameter updates, following the hyperparameters (initial step size α = 2×10⁻⁴, momentum β1 = 0.5 and β2 = 0.999) used by Radford et al. (2016). The step size α is decayed exponentially to α = 2×10⁻⁶ starting halfway through training. The mini-batch size is 128. ℓ2 weight decay of 2.5×10⁻⁵ is applied to all multiplicative weights in linear layers (but not to the learned bias β or scale γ parameters applied after batch normalization). Weights are initialized from a zero-mean normal distribution with a standard deviation of 0.02, with one notable exception: BiGAN discriminator weights that directly multiply z inputs to be added to spatial convolution outputs have initializations scaled by the convolution kernel size; e.g., for a 5×5 kernel, weights are initialized with a standard deviation of 0.5, 25 times the standard initialization. (A sketch of these optimization settings in code follows the table.)
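
The Research Type row quotes the paper's two-stage protocol: train a BiGAN unsupervised, then transfer the encoder's features to supervised tasks. As a minimal sketch of the underlying joint objective, here is an illustrative PyTorch version for permutation-invariant MNIST. The module widths, the 50-d latent dimension, and the helper name bigan_losses are assumptions for exposition, not the paper's architecture or code (the authors used Theano and Caffe).

    import torch
    import torch.nn as nn

    # Illustrative BiGAN components for 784-d MNIST vectors.
    # Layer widths and the 50-d latent are hypothetical choices.
    latent_dim, data_dim = 50, 784

    E = nn.Sequential(nn.Linear(data_dim, 1024), nn.LeakyReLU(0.2),
                      nn.Linear(1024, latent_dim))                 # encoder: x -> z
    G = nn.Sequential(nn.Linear(latent_dim, 1024), nn.ReLU(),
                      nn.Linear(1024, data_dim), nn.Sigmoid())     # generator: z -> x
    D = nn.Sequential(nn.Linear(data_dim + latent_dim, 1024),
                      nn.LeakyReLU(0.2), nn.Linear(1024, 1))       # joint discriminator: (x, z) -> logit

    bce = nn.BCEWithLogitsLoss()

    def bigan_losses(x_real):
        # D separates "real" pairs (x, E(x)) from "fake" pairs (G(z), z);
        # G and E are trained to fool it, which drives E toward inverting G.
        z = torch.randn(x_real.size(0), latent_dim)
        pair_real = torch.cat([x_real, E(x_real)], dim=1)
        pair_fake = torch.cat([G(z), z], dim=1)
        d_real, d_fake = D(pair_real), D(pair_fake)
        ones, zeros = torch.ones_like(d_real), torch.zeros_like(d_fake)
        d_loss = bce(d_real, ones) + bce(d_fake, zeros)   # discriminator objective
        ge_loss = bce(d_fake, ones) + bce(d_real, zeros)  # generator/encoder objective
        return d_loss, ge_loss

After unsupervised training, only E would be kept, and its activations used as features for the auxiliary supervised tasks described in the quote.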
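
The Experiment Setup row reports concrete hyperparameters, so they translate directly into code. The sketch below mirrors the reported values (Adam with α = 2×10⁻⁴, β1 = 0.5, β2 = 0.999; exponential decay to 2×10⁻⁶ from the halfway point; weight decay of 2.5×10⁻⁵ on multiplicative weights only; N(0, 0.02) initialization) in PyTorch. It is a paraphrase of the paper's stated settings, not the authors' Theano/Caffe code, and the function names are invented for illustration.

    import torch

    def make_optimizer(module):
        # l2 weight decay (2.5e-5) only on multiplicative weights; biases
        # and batch-norm scale/shift parameters are excluded, matching the
        # setup quoted above.
        decay = [p for p in module.parameters() if p.dim() > 1]
        no_decay = [p for p in module.parameters() if p.dim() <= 1]
        return torch.optim.Adam(
            [{"params": decay, "weight_decay": 2.5e-5},
             {"params": no_decay, "weight_decay": 0.0}],
            lr=2e-4, betas=(0.5, 0.999))

    def lr_at(step, total_steps, lr0=2e-4, lr1=2e-6):
        # Constant step size for the first half of training, then
        # exponential decay from lr0 down to lr1 by the final step.
        half = total_steps // 2
        if step < half:
            return lr0
        frac = (step - half) / max(total_steps - half, 1)
        return lr0 * (lr1 / lr0) ** frac

    def init_weights(m):
        # Zero-mean normal init with std 0.02. The noted exception (BiGAN
        # discriminator weights multiplying z inputs, std 0.02 * k * k for
        # a k x k kernel, i.e. 0.5 for 5x5) would be applied to those
        # specific layers separately.
        if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d)):
            torch.nn.init.normal_(m.weight, mean=0.0, std=0.02)
            if m.bias is not None:
                torch.nn.init.zeros_(m.bias)

Usage would be along the lines of D.apply(init_weights) and opt = make_optimizer(D), then setting g["lr"] = lr_at(step, total_steps) for each of opt.param_groups before each update, with mini-batches of 128 as reported.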