Explaining and Harnessing Adversarial Examples

Authors: Ian Goodfellow, Jonathon Shlens, and Christian Szegedy

ICLR 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset." |
| Researcher Affiliation | Industry | "Ian J. Goodfellow, Jonathon Shlens & Christian Szegedy, Google Inc., Mountain View, CA {goodfellow,shlens,szegedy}@google.com" |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper links to preprocessing code for Pylearn2 but not to the source code for the main methodology (e.g., the fast gradient sign method or the adversarial training implementation itself): "See https://github.com/lisa-lab/pylearn2/tree/master/pylearn2/scripts/papers/maxout for the preprocessing code, which yields a standard deviation of roughly 0.5." |
| Open Datasets | Yes | "reduce the test set error of a maxout network on the MNIST dataset. ... on ImageNet (Deng et al., 2009). ... CIFAR-10 (Krizhevsky & Hinton, 2009)." |
| Dataset Splits | Yes | "The original maxout result uses early stopping, and terminates learning after the validation set error rate has not decreased for 100 epochs. ... We found that the validation set error leveled off over time, and made very slow progress. ... We therefore used early stopping on the adversarial validation set error." |
| Hardware Specification | No | No specific hardware details (e.g., GPU model, CPU type, memory) used to run the experiments are mentioned. |
| Software Dependencies | No | The paper mentions Theano, Pylearn2, and DistBelief as software used, but does not provide version numbers for these dependencies. |
| Experiment Setup | Yes | "Let θ be the parameters of a model, x the input to the model, y the targets associated with x (for machine learning tasks that have targets) and J(θ, x, y) be the cost used to train the neural network. We can linearize the cost function around the current value of θ, obtaining an optimal max-norm constrained perturbation of η = ϵ sign(∇ₓ J(θ, x, y)). ... In all of our experiments, we used α = 0.5. ... Using this approach to train a maxout network that was also regularized with dropout ... using 1600 units per layer rather than the 240 used by the original maxout network for this problem. ... We therefore used early stopping on the adversarial validation set error. ... Five different training runs using different seeds for the random number generators used to select minibatches of training examples, initialize model weights, and generate dropout masks." |
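The quoted setup describes the fast gradient sign method, η = ϵ sign(∇ₓ J(θ, x, y)), and the adversarial training objective with α = 0.5. A minimal NumPy sketch is given below; it assumes a logistic regression model (an illustrative choice, not the maxout networks used in the paper's experiments) so that the gradient of the cost with respect to the input is available in closed form. All weights, inputs, and ϵ values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturbation(w, b, x, y, epsilon):
    """Max-norm constrained perturbation eta = epsilon * sign(grad_x J).

    For logistic regression with cross-entropy cost J, the gradient of J
    with respect to the input x is (sigmoid(w.x + b) - y) * w.
    """
    grad_x = (sigmoid(np.dot(w, x) + b) - y) * w
    return epsilon * np.sign(grad_x)

def adversarial_objective(loss_clean, loss_adv, alpha=0.5):
    # Adversarial training objective: a convex combination of the cost on
    # the clean input and on the FGSM-perturbed input (the paper uses
    # alpha = 0.5 in all experiments).
    return alpha * loss_clean + (1.0 - alpha) * loss_adv
```

Because the perturbation takes the sign of the input gradient, every component of η has magnitude exactly ϵ, and for a linear model a first-order step in this direction increases the cost.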