reproducibilityindex.ai

Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them

Authors: Florian Tramèr

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To illustrate, we revisit 14 empirical detector defenses published over the past years. For 12/14 defenses, we show that the claimed detection results imply an inefficient classifier with robustness far beyond the state-of-the-art. [...] We now survey 14 detection defenses, and consider the robust classification performance that these defenses implicitly claim (via Theorem 4). As we will see, in 12/14 cases, the defenses' detection results imply a computationally inefficient classifier with far better robust accuracy than the state-of-the-art.
Researcher Affiliation	Collaboration	(1) Google Research. (2) Work done while the author was at Stanford University. Correspondence to: Florian Tramèr <tramer@cs.stanford.edu>.
Pseudocode	No	The paper describes algorithms in prose and bullet points, but not in a formally labeled 'Pseudocode' or 'Algorithm' block or figure.
Open Source Code	No	The paper references third-party code (e.g., 'Robustness (python library), 2019. URL https://github.com/MadryLab/robustness') but does not state that the code for its own methodology or analysis is open-source, nor does it provide a link to it.
Open Datasets	Yes	The 14 detector defenses use three datasets: MNIST, CIFAR-10 and ImageNet, and consider adversarial examples under the ℓ∞ or ℓ2 norms.
Dataset Splits	No	The paper mentions using adversarially-trained classifiers from other works, but it does not specify the training, validation, or test dataset splits used for reproducibility in its own analysis.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments or analysis.
Software Dependencies	No	The paper cites external software like 'Robustness (python library), 2019' but does not provide a list of its own specific ancillary software dependencies with version numbers needed to replicate its analysis.
Experiment Setup	No	The paper describes how existing detection defense claims were analyzed and contrasted with state-of-the-art robust classification, and the formula used for bounding robust risk. However, it does not provide specific hyperparameters or training configurations for a model built or trained by the authors for their analysis.