reproducibilityindex.ai

Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples

Authors: Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang

NeurIPS 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9.91% false positives on benign inputs. In contrast, a state-of-the-art feature squeezing technique can only achieve 55% accuracy with 23.3% false positives. We use one of the most widely used FRSes, VGG-Face [19] to demonstrate effectiveness of AmI. Three datasets, VGG Face dataset (VF) [18], Labeled Faces in the Wild (LFW) [33] and CelebFaces Attributes dataset (CelebA) [34] are employed.
Researcher Affiliation	Academia	Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang. Department of Computer Science, Purdue University. {taog, ma229, liu1751, xyzhang}@cs.purdue.edu
Pseudocode	No	The paper describes its method in prose and mathematical equations but does not include any explicit pseudocode blocks or algorithms.
Open Source Code	Yes	AmI is available at GitHub [25]. [25] AmIAttribute. AmIAttribute/AmI. https://github.com/AmIAttribute/AmI, 2018.
Open Datasets	Yes	Three datasets, VGG Face dataset (VF) [18], Labeled Faces in the Wild (LFW) [33] and CelebFaces Attributes dataset (CelebA) [34] are employed. We use a small subset of the VF dataset (10 images) to extract attribute witnesses... We use 2000 training images from the VF set (1000 with the attribute and 1000 without the attribute) to train the model.
Dataset Splits	Yes	ϵ and β are set to 1.15 and 60, respectively in this paper. They are chosen through a tuning set of 100 benign images, which has no overlap with the test set.
Hardware Specification	No	The paper does not specify any details about the hardware (e.g., GPU, CPU models, memory) used for running the experiments.
Software Dependencies	Yes	We use the CleverHans library [36] to generate untargeted attacks FGSM and BIM. [36] Nicolas Papernot, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Fartash Faghri, Alexander Matyasko, Karen Hambardzumyan, Yi-Lin Juang, Alexey Kurakin, Ryan Sheatsley, et al. CleverHans v2.0.0: An Adversarial Machine Learning Library. arXiv preprint arXiv:1610.00768, 2016.
Experiment Setup	Yes	α defines the magnitude of weakening, which is set to 100 in this paper. ϵ and β are set to 1.15 and 60, respectively in this paper.
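
The two headline numbers in the table (detection accuracy on adversarial inputs, false positives on benign inputs) and the quoted hyperparameters can be kept together in a small script when attempting a reproduction. The following is only a minimal sketch: the class name, field names, and both functions are hypothetical and do not come from the AmI repository, and it assumes "detection accuracy" means the fraction of adversarial inputs that the detector flags.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReportedSettings:
    """Values quoted in the table above; the field names are hypothetical."""
    alpha: float = 100.0        # magnitude of weakening
    epsilon: float = 1.15       # strengthening parameter
    beta: float = 60.0          # strengthening parameter
    tuning_set_size: int = 100  # benign tuning images, disjoint from the test set


def detection_accuracy(flags_on_adversarial: Sequence[bool]) -> float:
    """Fraction of adversarial inputs flagged by the detector (assumed definition)."""
    if not flags_on_adversarial:
        raise ValueError("need at least one adversarial sample")
    return sum(flags_on_adversarial) / len(flags_on_adversarial)


def false_positive_rate(flags_on_benign: Sequence[bool]) -> float:
    """Fraction of benign inputs that the detector wrongly flags."""
    if not flags_on_benign:
        raise ValueError("need at least one benign sample")
    return sum(flags_on_benign) / len(flags_on_benign)


if __name__ == "__main__":
    settings = ReportedSettings()
    # Toy detector decisions, for illustration only (not results from the paper).
    adversarial_flags = [True, True, True, False]
    benign_flags = [False] * 9 + [True]
    print(settings)
    print("detection accuracy:", detection_accuracy(adversarial_flags))
    print("false positive rate:", false_positive_rate(benign_flags))
```

Keeping the two rates as separate functions mirrors how the table reports them: one is measured on adversarial inputs only, the other on benign inputs only.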