Adversarial Feature Learning
Authors: Jeff Donahue, Philipp Krähenbühl, Trevor Darrell
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the feature learning capabilities of BiGANs by first training them unsupervised as described in Section 3.4, then transferring the encoder's learned feature representations for use in auxiliary supervised learning tasks. To demonstrate that BiGANs are able to learn meaningful feature representations both on arbitrary data vectors, where the model is agnostic to any underlying structure, as well as very high-dimensional and complex distributions, we evaluate on both permutation-invariant MNIST (LeCun et al., 1998) and on the high-resolution natural images of ImageNet (Russakovsky et al., 2015). |
| Researcher Affiliation | Academia | Jeff Donahue (jdonahue@cs.berkeley.edu), Computer Science Division, University of California, Berkeley; Philipp Krähenbühl (philkr@utexas.edu), Department of Computer Science, University of Texas, Austin; Trevor Darrell (trevor@eecs.berkeley.edu), Computer Science Division, University of California, Berkeley |
| Pseudocode | No | The paper describes methods in text but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not state that code was released, nor does it link to a code repository for the BiGAN implementation. |
| Open Datasets | Yes | We evaluate on both permutation-invariant MNIST (LeCun et al., 1998) and on the high-resolution natural images of ImageNet (Russakovsky et al., 2015). |
| Dataset Splits | No | The paper mentions evaluating on the 'ImageNet LSVRC (Russakovsky et al., 2015) validation set', but it does not give percentages, sample counts, or any methodology for how training, validation, and test splits were defined; it appears to rely implicitly on the standard splits of these well-known datasets. |
| Hardware Specification | Yes | Most computation is performed on an NVIDIA Titan X or Tesla K40 GPU. |
| Software Dependencies | No | The paper mentions 'Theano (Theano Development Team, 2016)' and 'Caffe (Jia et al., 2014)' but does not provide specific version numbers for these or any other software dependencies beyond their publication year. |
| Experiment Setup | Yes | For unsupervised training of BiGANs and baseline methods, we use the Adam optimizer (Kingma & Ba, 2015) to compute parameter updates, following the hyperparameters (initial step size α = 2 × 10⁻⁴, momentum β1 = 0.5 and β2 = 0.999) used by Radford et al. (2016). The step size α is decayed exponentially to α = 2 × 10⁻⁶ starting halfway through training. The mini-batch size is 128. ℓ2 weight decay of 2.5 × 10⁻⁵ is applied to all multiplicative weights in linear layers (but not to the learned bias β or scale γ parameters applied after batch normalization). Weights are initialized from a zero-mean normal distribution with a standard deviation of 0.02, with one notable exception: BiGAN discriminator weights that directly multiply z inputs to be added to spatial convolution outputs have initializations scaled by the convolution kernel size; e.g., for a 5 × 5 kernel, weights are initialized with a standard deviation of 0.5, 25 times the standard initialization. |
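
To make the settings quoted in the Experiment Setup row easier to re-use, the following is a minimal sketch in PyTorch (not the authors' Theano/Caffe code) of the reported Adam hyperparameters, learning-rate decay, and weight initialization. `total_iters`, the module types touched by `init_weights`, and the per-parameter-group weight-decay split are assumptions, not details taken from the paper.

```python
# Minimal sketch of the reported optimizer and initialization settings (assumed PyTorch).
import torch
import torch.nn as nn

def init_weights(module, z_fanin_scale=1.0):
    """Zero-mean normal init with std 0.02, as quoted above.
    z_fanin_scale > 1 models the stated exception for discriminator weights that
    multiply z before being added to spatial convolution outputs
    (e.g. 25.0 for a 5x5 kernel, giving std 0.5)."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.02 * z_fanin_scale)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

def make_optimizer(params):
    # Adam with alpha = 2e-4, beta1 = 0.5, beta2 = 0.999 (Radford et al., 2016).
    # The quoted L2 weight decay of 2.5e-5 applies only to multiplicative weights
    # in linear layers, not to batch-norm scale/bias; reproducing that exactly
    # would require per-parameter-group weight decay, omitted here.
    return torch.optim.Adam(params, lr=2e-4, betas=(0.5, 0.999))

def lr_at_iteration(it, total_iters, lr_start=2e-4, lr_end=2e-6):
    """Constant 2e-4, then exponential decay to 2e-6 starting halfway through training."""
    half = total_iters // 2
    if it < half:
        return lr_start
    frac = (it - half) / (total_iters - half)
    return lr_start * (lr_end / lr_start) ** frac
```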
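
Similarly, the evaluation protocol quoted in the Research Type row (unsupervised BiGAN training, then transferring the encoder's features to supervised tasks) can be summarized as a linear-probe sketch. `encoder`, `train_loader`, `feat_dim`, and `num_classes` are hypothetical placeholders, and the paper's actual transfer experiments (e.g., ImageNet classification and detection) differ in detail.

```python
# Minimal sketch of the feature-transfer evaluation: freeze the unsupervised
# encoder E and fit a simple supervised classifier on its features (assumed PyTorch).
import torch
import torch.nn as nn

def evaluate_transfer(encoder, train_loader, feat_dim, num_classes, epochs=10):
    encoder.eval()                      # freeze the unsupervised encoder
    for p in encoder.parameters():
        p.requires_grad_(False)

    classifier = nn.Linear(feat_dim, num_classes)   # linear probe on E(x)
    opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in train_loader:
            with torch.no_grad():
                feats = encoder(x).flatten(1)       # encoder features for input x
            loss = loss_fn(classifier(feats), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return classifier
```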