Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope

Authors: Eric Wong, Zico Kolter

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the approach on a number of tasks to train classifiers with robust adversarial guarantees (e.g. for MNIST, we produce a convolutional classifier that provably has less than 5.8% test error for any adversarial attack with bounded ℓ∞ norm less than ϵ = 0.1), and code for all experiments is available at http://github.com/locuslab/convex_adversarial.
Researcher Affiliation | Academia | 1Machine Learning Department, Carnegie Mellon University, Pittsburgh PA, 15213, USA 2Computer Science Department, Carnegie Mellon University, Pittsburgh PA, 15213, USA. Correspondence to: Eric Wong <ericwong@cs.cmu.edu>, J. Zico Kolter <zkolter@cs.cmu.edu>.
Pseudocode | Yes | Algorithm 1 Computing Activation Bounds (a simplified activation-bound sketch follows the table)
Open Source Code | Yes | code for all experiments is available at http://github.com/locuslab/convex_adversarial.
Open Datasets | Yes | We evaluate our approach on classification tasks such as human activity recognition, MNIST digit classification, Fashion-MNIST, and street view housing numbers. ... Fashion-MNIST dataset (Xiao et al., 2017), a harder dataset with the same size (in dimension and number of examples) as MNIST and human activity recognition dataset (Anguita et al., 2013). (A data-loading sketch follows the table.)
Dataset Splits | No | The paper mentions training and testing but does not provide specific details on train/validation/test splits (e.g., percentages or sample counts) for the datasets used.
Hardware Specification | Yes | All experiments were run on a single Titan X GPU.
Software Dependencies | No | The paper mentions using a standard stochastic gradient variant and autodiff toolkit but does not specify any software names with version numbers.
Experiment Setup | Yes | The final classifier after 100 epochs reaches a test error of 1.80% with a robust test error of 5.82%. (An empirical ℓ∞-attack evaluation sketch follows the table.)
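
The Pseudocode row points to the paper's Algorithm 1 for computing activation bounds. The sketch below is not that algorithm: it computes elementwise pre-activation bounds for a single linear layer under an ℓ∞ perturbation of radius ε using plain interval arithmetic, which is looser than the paper's dual-network bounds but illustrates what such bounds look like. The layer sizes are hypothetical.

```python
import torch

def interval_bounds(linear: torch.nn.Linear, x: torch.Tensor, eps: float):
    """Elementwise pre-activation bounds of W x' + b over ||x' - x||_inf <= eps.

    Plain interval arithmetic: looser than the dual-network bounds of the
    paper's Algorithm 1, but enough to illustrate what activation bounds are.
    """
    W, b = linear.weight, linear.bias
    center = x @ W.t() + b                # nominal pre-activation
    radius = eps * W.abs().sum(dim=1)     # worst-case per-unit deviation
    return center - radius, center + radius

# Hypothetical sizes, chosen only for the example.
layer = torch.nn.Linear(784, 100)
x = torch.rand(1, 784)
lower, upper = interval_bounds(layer, x, eps=0.1)
assert torch.all(lower <= upper)
```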
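
All datasets in the Open Datasets row are publicly available. A minimal loading sketch for MNIST and Fashion-MNIST using torchvision is shown below; the use of torchvision, the transform, and the batch size are assumptions made for illustration, not details taken from the paper's repository.

```python
import torch
from torchvision import datasets, transforms

# Minimal sketch: download MNIST and Fashion-MNIST, two of the datasets
# used in the experiments. Batch size and transform are illustrative only.
to_tensor = transforms.ToTensor()

mnist_train = datasets.MNIST("./data", train=True, download=True, transform=to_tensor)
mnist_test = datasets.MNIST("./data", train=False, download=True, transform=to_tensor)
fashion_train = datasets.FashionMNIST("./data", train=True, download=True, transform=to_tensor)

train_loader = torch.utils.data.DataLoader(mnist_train, batch_size=50, shuffle=True)
test_loader = torch.utils.data.DataLoader(mnist_test, batch_size=50, shuffle=False)
```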
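
The 5.82% robust test error quoted in the Experiment Setup row is a certified upper bound at ε = 0.1; any empirical ℓ∞-bounded attack on the same model should therefore do no better. A generic PGD evaluation sketch for such a sanity check is below; `model`, `x`, and `y` are placeholders for a trained classifier and a test batch, and the step size and iteration count are assumed values. This is not the paper's certification procedure.

```python
import torch
import torch.nn.functional as F

def pgd_error(model, x, y, eps=0.1, alpha=0.01, steps=40):
    """Empirical robust error under an l_inf PGD attack of radius eps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()      # ascent step on the loss
            delta.clamp_(-eps, eps)                 # project back into the l_inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep pixels in [0, 1]
        delta.grad.zero_()
    pred = model(x + delta).argmax(dim=1)
    return (pred != y).float().mean().item()
```

Because PGD only lower-bounds the true robust error, the error rate it measures on a model trained with the paper's method should fall at or below the certified 5.82% figure.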