Lower bounds on the robustness to adversarial perturbations
Authors: Jonathan Peck, Joris Roels, Bart Goossens, Yvan Saeys
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally verify the bounds on the MNIST and CIFAR-10 data sets and find no violations. Additionally, the experimental results suggest that very small adversarial perturbations may occur with non-zero probability on natural samples. |
| Researcher Affiliation | Academia | Jonathan Peck1,2, Joris Roels2,3, Bart Goossens3, and Yvan Saeys1,2 1Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, 9000, Belgium 2Data Mining and Modeling for Biomedicine, VIB Inflammation Research Center, Ghent, 9052, Belgium 3Department of Telecommunications and Information Processing, Ghent University, Ghent, 9000, Belgium |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. The derivations are presented mathematically. |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the source code for the methodology described in the paper is openly available. |
| Open Datasets | Yes | We tested the theoretical bounds on the MNIST and CIFAR-10 test sets using the Caffe [Jia et al., 2014] implementation of LeNet [LeCun et al., 1998]. The MNIST data set [LeCun et al., 1998] consists of 70,000 28×28 images of handwritten digits; the CIFAR-10 data set [Krizhevsky and Hinton, 2009] consists of 60,000 32×32 RGB images of various natural scenes, each belonging to one of ten possible classes. |
| Dataset Splits | No | The paper mentions the total sizes of the MNIST (70,000) and CIFAR-10 (60,000) datasets and states that 'test sets' were used, but it does not specify explicit percentages or counts for training, validation, and test splits, nor does it explicitly mention the use of a validation set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Caffe [Jia et al., 2014]' but does not provide specific version numbers for Caffe or any other software dependencies needed to replicate the experiments. |
| Experiment Setup | Yes | Because our method only computes norms and does not provide a way to generate actual adversarial perturbations, we used the fast gradient sign method (FGS) [Goodfellow et al., 2015] to adversarially perturb each sample in the test sets in order to assess the tightness of our theoretical bounds. FGS linearizes the cost function of the network to obtain an estimated perturbation η = ε · sign(∇ₓ L(x, θ)). Here, ε > 0 is a parameter of the algorithm, L is the loss function and θ is the set of parameters of the network. The magnitudes of the perturbations found by FGS depend on the choice of ε, so we had to minimize this value in order to obtain the smallest perturbations the FGS method could supply. This was accomplished using a simple binary search for the smallest value of ε which still resulted in misclassification. As the MNIST and CIFAR-10 samples have pixel values within the range [0, 255], we upper-bounded ε by 100. |
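The FGS-plus-binary-search procedure quoted above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: the function names (`fgs_perturbation`, `smallest_eps`) and the toy linear classifier in the usage note are assumptions, and a real reproduction would compute `grad_fn` via backpropagation through the trained LeNet rather than analytically.

```python
import numpy as np

def fgs_perturbation(grad, eps):
    """Fast gradient sign perturbation: eta = eps * sign(grad of L w.r.t. x)."""
    return eps * np.sign(grad)

def smallest_eps(x, label, predict, grad_fn, eps_hi=100.0, tol=1e-3):
    """Binary-search the smallest eps for which FGS flips the prediction.

    predict(x) returns a class label; grad_fn(x, label) returns dL/dx at x.
    eps_hi mirrors the paper's upper bound of 100 for [0, 255] pixel values.
    Returns None if even eps_hi fails to cause misclassification.
    """
    g = grad_fn(x, label)
    if predict(x + fgs_perturbation(g, eps_hi)) == label:
        return None  # FGS cannot fool the model within the search range
    lo, hi = 0.0, eps_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predict(x + fgs_perturbation(g, mid)) == label:
            lo = mid  # still correctly classified: need a larger eps
        else:
            hi = mid  # misclassified: try a smaller eps
    return hi
```

As a toy check, for a linear binary classifier `predict(z) = 1 if w @ z > 0 else -1` with loss L(x, y) = −y·(w @ x), the gradient is −y·w, and the search recovers the exact decision-boundary crossing.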