Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Are Labels Required for Improving Adversarial Robustness?
Authors: Jean-Baptiste Alayrac, Jonathan Uesato, Po-Sen Huang, Alhussein Fawzi, Robert Stanforth, Pushmeet Kohli
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On standard datasets like CIFAR10, a simple Unsupervised Adversarial Training (UAT) approach using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone, and captures over 95% of the improvement from the same number of labeled examples. Finally, we report an improvement of 4% over the previous state-of-the-art on CIFAR-10 against the strongest known attack by using additional unlabeled data from the uncurated 80 Million Tiny Images dataset. |
| Researcher Affiliation | Industry | Jonathan Uesato, Jean-Baptiste Alayrac, Po-Sen Huang, Robert Stanforth, Alhussein Fawzi, Pushmeet Kohli — DeepMind |
| Pseudocode | Yes | The pseudocode and implementation details are described in Appendix A.1. |
| Open Source Code | Yes | Our trained model is available on our repository.1 https://github.com/deepmind/deepmind-research/tree/master/unsupervised_adversarial_training |
| Open Datasets | Yes | We run experiments on the CIFAR-10 and SVHN datasets... We use the 80 Million Tiny Images [44] dataset (hereafter, 80m) as our uncurated data source... |
| Dataset Splits | Yes | We also split out 10000 examples from the training set to use as validation, for both CIFAR-10 and SVHN, since neither dataset comes with a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models, or cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper mentions various techniques, models (e.g., WideResNet), and algorithms (PGD) but does not list specific software dependencies with their version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | We follow previous work [30, 54] for our choices of model architecture, data preprocessing, and hyperparameters, which are detailed in Appendix A. |
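The validation split reported above (holding out 10,000 training examples for both CIFAR-10 and SVHN) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name, fixed seed, and use of NumPy are assumptions for the example.

```python
import numpy as np

def split_train_val(n_train=50000, n_val=10000, seed=0):
    """Partition n_train example indices into disjoint train/validation sets.

    Mirrors the paper's described protocol of splitting out 10,000
    examples from the training set for validation (CIFAR-10 has
    50,000 training examples). Seed and shuffling strategy are
    illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_train)          # shuffle all training indices
    return perm[n_val:], perm[:n_val]        # (train indices, val indices)

train_idx, val_idx = split_train_val()
```

With the defaults above this yields 40,000 training indices and 10,000 validation indices with no overlap; any deterministic shuffle with a recorded seed would serve the same reproducibility purpose.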