Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks

Authors: David Stutz, Matthias Hein, Bernt Schiele

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate CCAT in comparison with AT (Madry et al., 2018) and related work (Maini et al., 2020; Zhang et al., 2019) on MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011) and Cifar10 (Krizhevsky, 2009), as well as MNIST-C (Mu & Gilmer, 2019) and Cifar10-C (Hendrycks & Dietterich, 2019) with corrupted examples (e.g., blur, noise, compression, transforms). We report confidence-thresholded test error (Err; lower is better) and confidence-thresholded robust test error (RErr; lower is better) for a confidence threshold τ corresponding to 99% true positive rate (TPR); we omit τ for brevity.
Researcher Affiliation | Academia | ¹Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken; ²University of Tübingen, Tübingen. Correspondence to: David Stutz <david.stutz@mpi-inf.mpg.de>.
Pseudocode | Yes | Algorithm 1: Confidence-Calibrated Adversarial Training (CCAT). The only changes compared to standard adversarial training are the attack (line 4) and the probability distribution over the classes (lines 6 and 7), which becomes more uniform as the distance δ increases. During testing, low-confidence (adversarial) examples are rejected. (A minimal sketch of the target distribution and loss follows the table.)
Open Source Code | Yes | We make our code (training and evaluation) and pre-trained models publicly available at davidstutz.de/ccat.
Open Datasets | Yes | We evaluate CCAT in comparison with AT (Madry et al., 2018) and related work (Maini et al., 2020; Zhang et al., 2019) on MNIST (LeCun et al., 1998), SVHN (Netzer et al., 2011) and Cifar10 (Krizhevsky, 2009), as well as MNIST-C (Mu & Gilmer, 2019) and Cifar10-C (Hendrycks & Dietterich, 2019) with corrupted examples (e.g., blur, noise, compression, transforms).
Dataset Splits | Yes | Err is computed on 9000 test examples; RErr is computed on 1000 test examples. The confidence threshold τ depends only on correctly classified clean examples and is fixed at 99% TPR on the held-out last 1000 test examples. (A threshold sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running experiments.
Software Dependencies | No | The paper mentions 'implemented in PyTorch (Paszke et al., 2017)' but does not specify a version number for PyTorch or any other relevant software dependency.
Experiment Setup | Yes | Training: We train 50%/50% AT (AT-50%) and CCAT, as well as 100% AT (AT-100%), with L∞ attacks using T = 40 iterations for PGD-CE and PGD-Conf, respectively, and ϵ = 0.3 (MNIST) or ϵ = 0.03 (SVHN/Cifar10). We use ResNet-20 (He et al., 2016), implemented in PyTorch (Paszke et al., 2017), trained using stochastic gradient descent. For CCAT, we use ρ = 10. (Sketches of the training loss and attack appear below.)
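
As referenced in the Pseudocode row, the sketch below illustrates the CCAT target distribution and loss in PyTorch. It is a minimal sketch, assuming an L∞ threat model and the paper's power transition λ(δ) = (1 − min(1, ‖δ‖∞/ε))^ρ with ρ = 10; the helper names (`ccat_targets`, `ccat_loss`) are hypothetical, and `x_adv` is assumed to come from an ε-bounded attack such as PGD-Conf.

```python
import torch
import torch.nn.functional as F

def ccat_targets(x, x_adv, y, num_classes, epsilon, rho=10.0):
    # Per-example L_inf perturbation size: delta = ||x_adv - x||_inf.
    delta = (x_adv - x).flatten(1).abs().max(dim=1).values
    # Power transition: lambda = (1 - min(1, delta / epsilon)) ** rho.
    lam = (1.0 - torch.clamp(delta / epsilon, max=1.0)) ** rho
    one_hot = F.one_hot(y, num_classes).float()
    uniform = torch.full_like(one_hot, 1.0 / num_classes)
    # Interpolate between one-hot and uniform: the larger the perturbation,
    # the more uniform the target distribution becomes.
    return lam.unsqueeze(1) * one_hot + (1.0 - lam.unsqueeze(1)) * uniform

def ccat_loss(model, x, x_adv, y, num_classes, epsilon, rho=10.0):
    # Cross-entropy against the soft CCAT targets.
    targets = ccat_targets(x, x_adv, y, num_classes, epsilon, rho)
    log_probs = F.log_softmax(model(x_adv), dim=1)
    return -(targets * log_probs).sum(dim=1).mean()
```

At test time, this training objective is what makes thresholding meaningful: the model is pushed toward low (near-uniform) confidence on perturbed inputs, so rejecting low-confidence examples filters adversarial ones.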
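The setup row mentions PGD-Conf with T = 40 iterations. The paper's full attack includes momentum and backtracking, which are omitted here; the sketch keeps only the core idea of ascending on the confidence of the most likely wrong class under an L∞ constraint. The step size `alpha` and the random initialization are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def pgd_conf_sketch(model, x, y, epsilon, steps=40, alpha=None):
    # Assumed step size; the paper tunes the attack differently.
    alpha = alpha if alpha is not None else 2.0 * epsilon / steps
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    for _ in range(steps):
        probs = F.softmax(model(x + delta), dim=1)
        # Confidence in the most likely *wrong* class (true class zeroed out).
        wrong = probs.scatter(1, y.unsqueeze(1), 0.0)
        wrong.max(dim=1).values.sum().backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()        # gradient-ascent step
            delta.clamp_(-epsilon, epsilon)           # project onto the L_inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep images in [0, 1]
            delta.grad.zero_()
    return (x + delta).detach()
```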
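For the confidence-thresholded metrics (Err/RErr) and the split described in the Dataset Splits row, the sketch below shows one plausible NumPy implementation: τ is chosen at 99% TPR on the confidences of correctly classified held-out clean examples, and the thresholded error is computed over non-rejected test examples. This is an assumed reading; the paper's exact RErr additionally takes a per-example worst case over multiple attacks, which is not reproduced here.

```python
import numpy as np

def tpr_threshold(conf_correct_heldout, tpr=0.99):
    # tau such that a fraction `tpr` of correctly classified clean held-out
    # examples (here: the last 1000 test examples) keep confidence >= tau.
    return np.quantile(conf_correct_heldout, 1.0 - tpr)

def thresholded_error(confidences, predictions, labels, tau):
    # Error among examples that are not rejected at threshold tau.
    kept = confidences >= tau
    if not kept.any():
        return 0.0
    return float((predictions[kept] != labels[kept]).mean())
```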