Thermometer Encoding: One Hot Way To Resist Adversarial Examples

Authors: Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate this robustness with experiments on the MNIST, CIFAR-10, CIFAR-100, and SVHN datasets, and show that models with thermometer-encoded inputs consistently have higher accuracy on adversarial examples, without decreasing generalization. State-of-the-art accuracy under the strongest known white-box attack was increased from 93.20% to 94.30% on MNIST and 50.00% to 79.16% on CIFAR-10. (A hedged sketch of thermometer encoding appears after the table.)
Researcher Affiliation | Industry | Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow. Google Brain, Mountain View, CA. {buckman, aurkor, craffel, goodfellow}@google.com
Pseudocode | Yes | The DGA attack is described in Algorithm 2 and the LS-PGA attack is described in Algorithm 3. Both these algorithms make use of a getMask() sub-routine which is described in Algorithm 1. (A hedged sketch of the baseline iterative attack appears after the table.)
Open Source Code | No | Source code is available at http://anonymized
Open Datasets | Yes | We demonstrate this robustness with experiments on the MNIST, CIFAR-10, CIFAR-100, and SVHN datasets
Dataset Splits | No | The paper mentions using the MNIST, CIFAR-10, CIFAR-100, and SVHN datasets, but does not describe explicit train/validation/test splits.
Hardware Specification | No | The paper describes network architectures (e.g., Wide ResNet) and experimental settings, but does not provide specific details on the hardware (e.g., GPU/CPU models) used for training or inference.
Software Dependencies | No | The paper mentions optimizers (Adam, Momentum) and models (Wide ResNet, convolutional network) but does not list specific software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | For our MNIST experiments, we use a convolutional network; for CIFAR-10, CIFAR-100, and SVHN we use a Wide ResNet (Zagoruyko & Komodakis, 2016). We use a network of depth 30 for the CIFAR-10 and CIFAR-100 datasets, while for SVHN we use a network of depth 15. The width factor of all the Wide ResNets is set to k = 4. Unless otherwise specified, all quantized and discretized models use 16 levels. We found that in all cases, LS-PGA was strictly more powerful than DGA, so all attacks on discretized models use LS-PGA with ξ = 0.01, δ = 1.2, and 1 random restart. To be consistent with Madry et al. (2017), we describe attacks in terms of the maximum ℓ∞-norm of the attack, ε. All MNIST experiments used ε = 0.3 and 40 steps for iterative attacks; experiments on CIFAR used ε = 0.031 and 7 steps for iterative attacks; experiments on SVHN used ε = 0.047 and 10 steps for iterative attacks. These settings were used for adversarial training, white-box attacks, and black-box attacks. For MNIST we use the Adam optimizer with a fixed learning rate of 1e-4 as in Madry et al. (2017). For CIFAR-10 and CIFAR-100 we use the Momentum optimizer with momentum 0.9, ℓ2 weight decay of λ = 0.0005 and an initial learning rate of 0.1 which is annealed by a factor of 0.2 after epochs 60, 120 and 160 respectively as in Zagoruyko & Komodakis (2016). For SVHN we use the same optimizer with initial learning rate of 1e-2 which is annealed by a factor of 0.1 after epochs 80 and 120 respectively. We also use a dropout of 0.3 for CIFAR-10, CIFAR-100 and SVHN. (A hedged sketch of the CIFAR training schedule appears after the table.)
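
To make the paper's central idea concrete, below is a minimal sketch of thermometer encoding of pixel intensities: a value in [0, 1] is quantized into a fixed number of levels and represented by a cumulative binary vector rather than a one-hot vector. The function name, the NumPy implementation, and the edge conventions are illustrative assumptions, not the authors' released code.

import numpy as np

def thermometer_encode(x, levels=16):
    """Thermometer-encode values in [0, 1] into `levels` cumulative bits.

    x: array of any shape with values in [0, 1].
    Returns an array of shape x.shape + (levels,) where entry i is 1
    whenever x reaches the i-th quantization boundary, e.g. with
    levels=4, x=0.6 maps to [1, 1, 1, 0].
    """
    # Quantization boundaries at 0, 1/levels, ..., (levels-1)/levels.
    boundaries = np.arange(levels) / levels
    # Cumulative ("thermometer") pattern: all boundaries at or below x are on.
    return (x[..., np.newaxis] >= boundaries).astype(np.float32)

# Usage: encode a toy 2x2 grayscale image with 4 levels -> shape (2, 2, 4).
img = np.array([[0.0, 0.26], [0.51, 1.0]])
print(thermometer_encode(img, levels=4))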
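The paper's own attacks on discretized inputs (DGA and LS-PGA, Algorithms 1-3) are not reproduced here. As a reference point only, the sketch below shows the standard ℓ∞ PGD attack of Madry et al. (2017) on continuous inputs, using the MNIST settings quoted above (ε = 0.3, 40 steps). The PyTorch framework, step size, and stand-in model are assumptions for illustration; this is the baseline the paper compares against, not LS-PGA itself.

import torch
import torch.nn as nn

def pgd_attack(model, x, y, eps=0.3, steps=40, step_size=0.01):
    """Standard L-infinity PGD: random start, signed-gradient ascent steps,
    projection back into the eps-ball around x and into the [0, 1] pixel range."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()

# Usage on an MNIST-shaped batch with a stand-in linear classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)
y = torch.randint(0, 10, (8,))
x_adv = pgd_attack(model, x, y, eps=0.3, steps=40)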
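Finally, a minimal sketch of the CIFAR-10/CIFAR-100 optimization schedule quoted in the Experiment Setup row: momentum SGD with momentum 0.9, weight decay 5e-4, initial learning rate 0.1, annealed by a factor of 0.2 after epochs 60, 120, and 160. The paper does not name its framework; PyTorch, the stand-in model, and the total epoch count are assumptions here, and the actual architecture is a depth-30, width-factor-4 Wide ResNet.

import torch
import torch.nn as nn

# Stand-in model; the paper uses a Wide ResNet (depth 30, width factor k = 4).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Momentum SGD with the hyperparameters quoted for CIFAR-10/CIFAR-100.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

# Anneal the learning rate by a factor of 0.2 after epochs 60, 120, and 160.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)

num_epochs = 200  # total epoch count is an assumption; the paper does not state it
for epoch in range(num_epochs):
    # ... one pass over the (adversarially perturbed) training set would go here ...
    scheduler.step()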