reproducibilityindex.ai

Accurate, reliable and fast robustness evaluation

Authors: Wieland Brendel, Jonas Rauber, Matthias Kümmerer, Ivan Ustyuzhaninov, Matthias Bethge

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	These findings are carefully validated across a diverse set of six different models and hold for L0, L1, L2 and L∞ in both targeted as well as untargeted scenarios.
Researcher Affiliation	Academia	Wieland Brendel (1,3), Jonas Rauber (1-3), Matthias Kümmerer (1-3), Ivan Ustyuzhaninov (1-3), Matthias Bethge (1,3,4). (1) Centre for Integrative Neuroscience, University of Tübingen; (2) International Max Planck Research School for Intelligent Systems; (3) Bernstein Center for Computational Neuroscience Tübingen; (4) Max Planck Institute for Biological Cybernetics. wieland.brendel@bethgelab.org
Pseudocode	Yes	Algorithm 1: Schematic of our attacks.
Open Source Code	No	Implementations will soon be available in all major toolboxes (Foolbox, CleverHans and ART).
Open Datasets	Yes	Madry-MNIST [Madry et al., 2018]: Adversarially trained model on MNIST. Madry-CIFAR [Madry et al., 2018]: Adversarially trained model on CIFAR-10. ResNet-50 [He et al., 2016]: Standard vanilla ResNet-50 model trained on ImageNet.
Dataset Splits	Yes	All results reported have been evaluated on 1000 validation samples.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions 'Foolbox [Rauber et al., 2017]' but does not provide specific version numbers for Foolbox or any other software dependencies crucial for replication.
Experiment Setup	Yes	We conducted a large-scale hyperparameter tuning for each attack. In the appendix we list all hyperparameters and hyperparameter ranges for each attack. For the L∞ evaluation, we chose ϵ for each model and each attack scenario such that the best attack performance reaches roughly 50% accuracy.
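
The ϵ selection described in the Experiment Setup row can be illustrated with a small sketch. This is not the authors' procedure. It assumes that, for each evaluated sample, the size of the smallest adversarial perturbation found by the best attack is already known (samples misclassified without any perturbation would count as size 0). Under that assumption, model accuracy at a given ϵ is the fraction of samples whose smallest perturbation exceeds ϵ, so the median perturbation size yields roughly 50% accuracy. The function names and the perturbation sizes below are hypothetical.

```python
from statistics import median


def robust_accuracy(min_perturbation_sizes, epsilon):
    """Fraction of samples that no known perturbation of size <= epsilon fools."""
    survivors = sum(1 for size in min_perturbation_sizes if size > epsilon)
    return survivors / len(min_perturbation_sizes)


def epsilon_for_half_accuracy(min_perturbation_sizes):
    """Pick epsilon as the median size, so about half the samples stay correct."""
    return median(min_perturbation_sizes)


# Hypothetical per-sample minimal perturbation sizes (illustrative values only).
sizes = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]

eps = epsilon_for_half_accuracy(sizes)
acc = robust_accuracy(sizes, eps)
print(f"epsilon = {eps}, accuracy at epsilon = {acc}")
```

With real data, ties at the median or an odd sample count would make the resulting accuracy only approximately 50%, which is consistent with the "roughly 50%" wording in the table.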