Accurate, reliable and fast robustness evaluation
Authors: Wieland Brendel, Jonas Rauber, Matthias Kümmerer, Ivan Ustyuzhaninov, Matthias Bethge
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | These findings are carefully validated across a diverse set of six different models and hold for L0, L1, L2 and L∞ in both targeted as well as untargeted scenarios. |
| Researcher Affiliation | Academia | Wieland Brendel (1,3), Jonas Rauber (1-3), Matthias Kümmerer (1-3), Ivan Ustyuzhaninov (1-3), Matthias Bethge (1,3,4); (1) Centre for Integrative Neuroscience, University of Tübingen; (2) International Max Planck Research School for Intelligent Systems; (3) Bernstein Center for Computational Neuroscience Tübingen; (4) Max Planck Institute for Biological Cybernetics; wieland.brendel@bethgelab.org |
| Pseudocode | Yes | Algorithm 1: Schematic of our attacks. |
| Open Source Code | No | Implementations will soon be available in all major toolboxes (Foolbox, CleverHans and ART). |
| Open Datasets | Yes | Madry-MNIST [Madry et al., 2018]: Adversarially trained model on MNIST. Madry-CIFAR [Madry et al., 2018]: Adversarially trained model on CIFAR-10. ResNet-50 [He et al., 2016]: Standard vanilla ResNet-50 model trained on ImageNet |
| Dataset Splits | Yes | All results reported have been evaluated on 1000 validation samples. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Foolbox [Rauber et al., 2017]' but does not provide specific version numbers for Foolbox or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | We conducted a large-scale hyperparameter tuning for each attack. In the appendix we list all hyperparameters and hyperparameter ranges for each attack. For the L∞ evaluation, we chose ϵ for each model and each attack scenario such that the best attack performance reaches roughly 50% accuracy. |
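
The ϵ selection quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes that, for each attack, the size of the smallest adversarial perturbation found per sample is already available. The function names and the toy numbers are hypothetical and are not taken from the paper or from Foolbox. The sketch treats the "best" attack as the one with the lowest model accuracy at a given ϵ, so the chosen ϵ is the smallest per-attack median perturbation size.

```python
import numpy as np

def accuracy_at_eps(dists, eps):
    """Fraction of samples whose smallest found perturbation is larger than eps."""
    return float(np.mean(np.asarray(dists, dtype=float) > eps))

def pick_eps_for_half_accuracy(dists_per_attack):
    """Pick eps so that the strongest attack leaves roughly 50% accuracy.

    dists_per_attack: array of shape (n_attacks, n_samples) holding, for each
    attack, the per-sample size of the smallest adversarial perturbation found
    (hypothetical input; how these values are obtained is outside this sketch).
    """
    dists = np.asarray(dists_per_attack, dtype=float)
    # For a single attack, accuracy falls to about 50% at its median distance.
    medians = np.median(dists, axis=1)
    # The strongest attack is the one that reaches 50% at the smallest eps.
    eps = float(np.min(medians))
    # Accuracy under the strongest attack = lowest accuracy over all attacks.
    best_acc = min(accuracy_at_eps(d, eps) for d in dists)
    return eps, best_acc

# Toy data: two attacks evaluated on four samples (made-up values).
toy = [[0.1, 0.2, 0.3, 0.4],
       [0.2, 0.3, 0.4, 0.5]]
eps, acc = pick_eps_for_half_accuracy(toy)
print(eps, acc)
```

On real evaluations the resulting accuracy will only be roughly 50%, since ties and a finite sample count prevent an exact split.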