Adversarial Manipulation of Deep Representations

Authors: Sara Sabour, Yanshuai Cao, Fartash Faghri, David Fleet

ICLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that the image representations in a deep neural network (DNN) can be manipulated to mimic those of other natural images, with only minor, imperceptible perturbations to the original image. Previous methods for generating adversarial images focused on image perturbations designed to produce erroneous class labels. Here we instead concentrate on the internal layers of DNN representations, to produce a new class of adversarial images that differs qualitatively from others.
Researcher Affiliation | Collaboration | Sara Sabour (1), Yanshuai Cao (1,2), Fartash Faghri (1,2) & David J. Fleet (1); (1) Department of Computer Science, University of Toronto, Canada; (2) Architech Labs, Toronto, Canada
Pseudocode | No | The paper describes the optimization process mathematically and refers to the optimization method (L-BFGS-B), but does not include explicit pseudocode or an algorithm block. (A restatement of the optimization problem is sketched after the table.)
Open Source Code | No | The paper does not contain any explicit statements about code release or links to source code repositories.
Open Datasets | Yes | ImageNet ILSVRC data (Deng et al., 2009)
Dataset Splits | Yes | We test on a dataset comprising over 20,000 source-guide pairs, sampled from training, test and validation sets of ILSVRC, plus some images from Wikipedia to increase diversity.
Hardware Specification | No | The paper mentions experiments on various networks (CaffeNet, AlexNet, GoogLeNet, VGG CNN-S) but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions specific DNN models, including the "BVLC CaffeNet Reference model," "AlexNet," "GoogLeNet," and "VGG CNN-S," as well as L-BFGS-B for optimization, but it does not give version numbers for these software components or for any other libraries, making it difficult to reproduce the software environment precisely.
Experiment Setup | Yes | We use L-BFGS-B, with the inequality (2) expressed as a box constraint around the source image I_s. Rather than optimizing δ for each image, we find that a fixed value of δ = 10 (out of 255) produces compelling adversarial images with negligible perceptual distortion. The optimization is run for a maximum of 500 iterations. (A SciPy-based sketch of this setup follows the table.)
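
As flagged in the Pseudocode row, the paper states its procedure as an optimization problem rather than an algorithm block. A rough restatement of that problem, assuming the usual notation (I_s the source image, I_g the guide image, φ_k the network's internal representation at layer k), is:

```latex
I_{\alpha} \;=\; \arg\min_{I} \, \bigl\lVert \phi_k(I) - \phi_k(I_g) \bigr\rVert_2^2
\qquad \text{subject to} \qquad \lVert I - I_s \rVert_\infty < \delta
```

That is, the perturbed image is pulled toward the guide's internal representation while every pixel stays within δ of the source image.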
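For the Experiment Setup row, the following is a minimal sketch of how the stated configuration (L-BFGS-B, the box constraint of inequality (2), δ = 10 out of 255, at most 500 iterations) could be wired up with SciPy. The feature extractor `phi` is a hypothetical stand-in for the DNN layer used in the paper, chosen only to keep the sketch self-contained and runnable; with a real network, the gradient of the loss would come from backpropagation instead.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

rng = np.random.default_rng(0)

# Hypothetical stand-in for the internal DNN representation phi_k: a fixed
# random linear map, used only so this sketch runs end to end.
D_IN, D_FEAT = 32 * 32 * 3, 256
W = rng.standard_normal((D_FEAT, D_IN)) / np.sqrt(D_IN)
phi = lambda x: W @ x

def loss_and_grad(I, guide_feat):
    """Feature-matching objective ||phi(I) - phi(I_g)||^2 and its gradient."""
    diff = phi(I) - guide_feat
    return float(diff @ diff), 2.0 * (W.T @ diff)  # with a real DNN: backprop

# Stand-in source and guide images, flattened, with pixel values in [0, 255].
I_s = rng.uniform(0, 255, size=D_IN)
I_g = rng.uniform(0, 255, size=D_IN)

# Inequality (2), ||I - I_s||_inf <= delta, expressed as per-pixel box bounds
# (intersected with the valid pixel range), the constraint form L-BFGS-B accepts.
delta = 10.0
bounds = [(max(0.0, p - delta), min(255.0, p + delta)) for p in I_s]

I_adv, final_loss, info = fmin_l_bfgs_b(
    loss_and_grad, x0=I_s.copy(), args=(phi(I_g),),
    bounds=bounds, maxiter=500)

print("final feature distance:", final_loss)
print("max per-pixel change:", np.abs(I_adv - I_s).max())  # should be <= delta
```

Expressing inequality (2) as per-pixel bounds is what lets a generic box-constrained solver such as L-BFGS-B enforce the perturbation limit without any explicit projection step.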