Accurate, reliable and fast robustness evaluation

Authors: Wieland Brendel, Jonas Rauber, Matthias Kümmerer, Ivan Ustyuzhaninov, Matthias Bethge

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | These findings are carefully validated across a diverse set of six different models and hold for L0, L1, L2 and L∞ in both targeted as well as untargeted scenarios.
Researcher Affiliation | Academia | Wieland Brendel (1,3), Jonas Rauber (1-3), Matthias Kümmerer (1-3), Ivan Ustyuzhaninov (1-3), Matthias Bethge (1,3,4). (1) Centre for Integrative Neuroscience, University of Tübingen; (2) International Max Planck Research School for Intelligent Systems; (3) Bernstein Center for Computational Neuroscience Tübingen; (4) Max Planck Institute for Biological Cybernetics. Contact: wieland.brendel@bethgelab.org
Pseudocode | Yes | Algorithm 1: Schematic of our attacks.
Open Source Code | No | Implementations will soon be available in all major toolboxes (Foolbox, CleverHans and ART).
Open Datasets | Yes | Madry-MNIST [Madry et al., 2018]: Adversarially trained model on MNIST. Madry-CIFAR [Madry et al., 2018]: Adversarially trained model on CIFAR-10. ResNet-50 [He et al., 2016]: Standard vanilla ResNet-50 model trained on ImageNet.
Dataset Splits | Yes | All results reported have been evaluated on 1000 validation samples.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'Foolbox [Rauber et al., 2017]' but does not provide specific version numbers for Foolbox or any other software dependencies crucial for replication.
Experiment Setup | Yes | We conducted a large-scale hyperparameter tuning for each attack. In the appendix we list all hyperparameters and hyperparameter ranges for each attack. For the L∞ evaluation, we chose ϵ for each model and each attack scenario such that the best attack performance reaches roughly 50% accuracy.
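The ϵ selection mentioned under Experiment Setup can be illustrated with a minimal sketch. It is not the authors' procedure, which is not spelled out here. It assumes that each attack reports, per sample, the norm of the smallest adversarial perturbation it found (`math.inf` if it failed, 0 if the clean sample is already misclassified). The function names `pick_epsilon` and `robust_accuracy` are hypothetical.

```python
import math
from statistics import median

def robust_accuracy(min_norms, eps):
    """Fraction of samples whose smallest known adversarial
    perturbation is larger than eps (i.e. still robust at eps)."""
    return sum(1 for d in min_norms if d > eps) / len(min_norms)

def pick_epsilon(norms_per_attack):
    """Return an eps at which the combined best attack leaves
    roughly 50% accuracy (assumed selection rule, not the paper's code).

    norms_per_attack maps an attack name to a list of per-sample
    minimal perturbation norms, all lists in the same sample order.
    """
    # For every sample, keep the smallest perturbation any attack found.
    best = [min(sample) for sample in zip(*norms_per_attack.values())]
    # The median splits the samples in half: about 50% have a known
    # adversarial within eps, about 50% do not.
    return median(best), best

# Toy data for four samples and two hypothetical attacks.
norms = {
    "attack_a": [0.1, 0.4, math.inf, 0.3],
    "attack_b": [0.2, 0.2, 0.5, math.inf],
}
eps, best = pick_epsilon(norms)
acc = robust_accuracy(best, eps)
```

On real data the same idea would be applied to the 1000 evaluated validation samples, once per model and per attack scenario.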