Robustness May Be at Odds with Accuracy

Authors: Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical examination: "In Section 2.1, we showed that the trade-off between standard accuracy and robustness might be inevitable. To examine how representative our theoretical model is of real-world datasets, we also experimentally investigate this issue on MNIST (LeCun et al., 1998), as it is amenable to linear classifiers."
Researcher Affiliation | Academia | Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry; Massachusetts Institute of Technology; {tsipras,shibani,engstrom,turneram,madry}@mit.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "MNIST: We use the simple convolutional architecture from the TensorFlow tutorial (TFM, 2017). CIFAR-10: We consider a standard ResNet model (He et al., 2015a). It has 4 groups of residual layers with filter sizes (16, 16, 32, 64) and 5 residual units each." Code: https://github.com/MadryLab/mnist_challenge/ and https://github.com/MadryLab/cifar10_challenge/ (see the architecture sketch after the table).
Open Datasets | Yes | "We perform our experimental analysis on the MNIST (LeCun et al., 2010), CIFAR-10 (Krizhevsky & Hinton, 2009) and (restricted) ImageNet (Deng et al., 2009) datasets." (A loading sketch for the public datasets follows the table.)
Dataset Splits | No | The paper reports "Standard accuracy (train)" and "Standard accuracy (test)" in its figures and tables, but does not explicitly specify a validation split or how one was used.
Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU types, or memory specifications; it only names model architectures such as ResNet.
Software Dependencies | No | The paper mentions the TensorFlow tutorial and the tensorpack repository as sources for model architectures, but does not pin software versions (e.g., TensorFlow version, Python version, or specific library versions).
Experiment Setup | Yes | "Table 2: Value of ε used for adversarial training/evaluation of each dataset and ℓp-norm." From Appendix A.3 (Adversarial Training): "We perform adversarial training to train robust classifiers following Madry et al. (2017). Specifically, we train against a projected gradient descent (PGD) adversary, starting from a random initial perturbation of the training data." For binary MNIST: "We use the cross-entropy loss and perform 100 epochs of gradient descent in training." (A PGD training sketch follows the table.)
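
As a companion to the Open Source Code row, here is a minimal sketch of the CIFAR-10 ResNet described there: 4 groups of residual layers with filter sizes (16, 16, 32, 64) and 5 residual units per group. The pre-activation ordering, stride placement, and classifier head below are illustrative assumptions, not details taken from the paper or the linked repositories.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_unit(x, filters, stride=1):
    """One pre-activation residual unit (assumed layout, not the paper's exact one)."""
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, strides=stride, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    # Project the shortcut when the spatial size or channel count changes.
    if stride != 1 or shortcut.shape[-1] != filters:
        shortcut = layers.Conv2D(filters, 1, strides=stride, padding="same")(x)
    return layers.Add()([shortcut, y])

def build_resnet(num_classes=10):
    inputs = layers.Input(shape=(32, 32, 3))  # CIFAR-10 images
    x = layers.Conv2D(16, 3, padding="same")(inputs)
    # 4 groups with filter sizes (16, 16, 32, 64), 5 residual units each,
    # downsampling at the start of every group after the first (an assumption).
    for group, filters in enumerate((16, 16, 32, 64)):
        for unit in range(5):
            stride = 2 if group > 0 and unit == 0 else 1
            x = residual_unit(x, filters, stride)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes)(x)  # logits
    return tf.keras.Model(inputs, outputs)
```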
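The two openly available datasets from the Open Datasets row can be fetched directly through tf.keras; this is a convenience sketch, and the (restricted) ImageNet subset used in the paper is not bundled with TensorFlow, so it is omitted here.

```python
import tensorflow as tf

# MNIST and CIFAR-10, the two public datasets named in the paper.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
(cx_train, cy_train), (cx_test, cy_test) = tf.keras.datasets.cifar10.load_data()

# Scale pixel values to [0, 1], the range assumed by the PGD sketch below.
x_train, x_test = x_train / 255.0, x_test / 255.0
cx_train, cx_test = cx_train / 255.0, cx_test / 255.0
```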
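Finally, a minimal sketch of the training procedure quoted in the Experiment Setup row: an ℓ∞ PGD adversary with a random start inside the ε-ball, following Madry et al. (2017), with the model trained on the resulting adversarial examples. The ε, step size, step count, and optimizer choice below are illustrative assumptions; the paper's actual ε values are given in its Table 2.

```python
import tensorflow as tf

def pgd_attack(model, x, y, eps=0.3, alpha=0.01, steps=40):
    """l_inf PGD with a random start; eps/alpha/steps are placeholder values."""
    # Random initial perturbation inside the l_inf ball of radius eps.
    x_adv = x + tf.random.uniform(tf.shape(x), -eps, eps)
    x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)  # assumes pixels in [0, 1]
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y, model(x_adv))
        grad = tape.gradient(loss, x_adv)
        # Ascend along the gradient sign, then project back onto the eps-ball.
        x_adv = x_adv + alpha * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x - eps, x + eps)
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)
    return x_adv

@tf.function
def adversarial_train_step(model, optimizer, x, y):
    """One step of adversarial training: fit the model on PGD examples."""
    x_adv = pgd_attack(model, x, y)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x_adv))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```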