Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Improving Robustness using Generated Data
Authors: Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, Timothy A Mann
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on CIFAR-10, CIFAR-100, SVHN and TINYIMAGENET against ℓ∞ and ℓ2 norm-bounded perturbations of size ϵ = 8/255 and ϵ = 128/255, respectively. We show large absolute improvements in robust accuracy compared to previous state-of-the-art methods. |
| Researcher Affiliation | Industry | Sven Gowal*, Sylvestre-Alvise Rebuffi*, Olivia Wiles, Florian Stimberg, Dan Calian and Timothy Mann DeepMind, London EMAIL |
| Pseudocode | No | The information provided does not contain a structured pseudocode or algorithm block. The overall approach is described verbally and summarized in Figure 2, but not as a formal algorithmic listing. |
| Open Source Code | Yes | The training and evaluation code is available at https://github.com/deepmind/deepmind-research/tree/master/adversarial_robustness. |
| Open Datasets | Yes | We use the CIFAR-10 and CIFAR-100 datasets [42], as well as SVHN [53] and TINYIMAGENET [26]. |
| Dataset Splits | Yes | We perform hyper-parameter tuning on a held-out validation set. |
| Hardware Specification | Yes | All experiments are run on NVIDIA A100 GPUs. |
| Software Dependencies | No | The information provided mentions software used ('The models are implemented in Jax [6] and Haiku [35]') but does not specify version numbers for these or any other key software components, which is required for reproducible description of ancillary software. |
| Experiment Setup | Yes | We use stochastic weight averaging [38] with a decay rate of 0.995. For adversarial training, we use TRADES [82] with 10 Projected Gradient Descent (PGD) steps. We train for 400 CIFAR-10-equivalent epochs with a batch size of 1024 (i.e., 19K steps). |
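
The figures quoted in the Experiment Setup row can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the authors' repository: the function names are hypothetical, the CIFAR-10 training set size of 50,000 images is the standard figure rather than something stated in the table, and reading the "decay rate of 0.995" as an exponential moving average coefficient is an assumption.

```python
# Illustrative sketch; all names are hypothetical, not from the paper's code.

CIFAR10_TRAIN_SIZE = 50_000  # standard CIFAR-10 training set size (assumption)


def steps_for_epochs(epochs: int, batch_size: int,
                     dataset_size: int = CIFAR10_TRAIN_SIZE) -> int:
    """Number of full batches needed to process epochs * dataset_size examples."""
    return (epochs * dataset_size) // batch_size


def ema_update(avg_params: dict, params: dict, decay: float = 0.995) -> dict:
    """One averaging step: avg <- decay * avg + (1 - decay) * params.

    Assumes the reported decay rate is an exponential moving average
    coefficient applied to the model weights after each training step.
    """
    return {k: decay * avg_params[k] + (1.0 - decay) * params[k] for k in params}


# 400 CIFAR-10-equivalent epochs at batch size 1024.
total_steps = steps_for_epochs(epochs=400, batch_size=1024)
print(total_steps)  # 19531

# Toy example: averaged weight drifting from 0.0 toward a constant weight 1.0.
avg = {"w": 0.0}
for _ in range(3):
    avg = ema_update(avg, {"w": 1.0})
print(avg["w"])
```

Under these assumptions the step count comes out at 19,531, in line with the roughly 19K steps quoted in the table. With a decay of 0.995 the averaged weights move slowly, giving an effective averaging window on the order of 1 / (1 - 0.995) = 200 steps.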