Unlabeled Data Improves Adversarial Robustness
Authors: Yair Carmon, Aditi Raghunathan, Ludwig Schmidt, John C. Duchi, Percy S. Liang
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semisupervised learning. ... The first part of our paper is theoretical ... Our theoretical findings motivate the second, empirical part of our paper, where we test the effect of unlabeled data and self-training on standard adversarial robustness benchmarks. |
| Researcher Affiliation | Academia | Yair Carmon Stanford University yairc@stanford.edu Aditi Raghunathan* Stanford University aditir@stanford.edu Ludwig Schmidt UC Berkeley ludwig@berkeley.edu Percy Liang Stanford University pliang@cs.stanford.edu John C. Duchi Stanford University jduchi@stanford.edu |
| Pseudocode | Yes | Meta-Algorithm 1 (Robust self-training). Input: labeled data (x_1, y_1), ..., (x_n, y_n) and unlabeled data x̃_1, ..., x̃_ñ. Parameters: standard loss L_standard, robust loss L_robust, and unlabeled weight w. 1: Learn θ̂_intermediate by minimizing L_standard(θ, x_i, y_i). 2: Generate pseudo-labels ỹ_i = f_θ̂_intermediate(x̃_i) for i = 1, 2, ..., ñ. 3: Learn θ̂_final by minimizing L_robust(θ, x_i, y_i) + w · L_robust(θ, x̃_i, ỹ_i). |
| Open Source Code | Yes | Code and data are available on GitHub at https://github.com/yaircarmon/semisup-adv and on CodaLab at https://bit.ly/349WsAC. |
| Open Datasets | Yes | For CIFAR-10 [22], we obtain 500K unlabeled images by mining the 80 Million Tiny Images dataset [46]... The SVHN dataset [53] is naturally split into a core training set of about 73K images and an extra training set with about 531K easier images. |
| Dataset Splits | No | No explicit statements regarding training/validation/test dataset splits (e.g., percentages or exact counts for all three) were found in the main text. It mentions using 'CIFAR-10 training set' and 'SVHN training data' and evaluating on 'test examples', but not the full split details. |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, or cloud computing specifications) used for running experiments are provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) are explicitly mentioned in the main text for replicating the experiments. |
| Experiment Setup | Yes | For adversarial training, we compute x_PG exactly as in [56] with ε = 8/255, and denote the resulting model as RSTadv(50K+500K). For stability training, we set the additive noise variance to σ = 0.25 and denote the result RSTstab(50K+500K). We use a WideResNet-28-10 architecture for both the intermediate pseudo-label generator and the final robust model. |
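The robust self-training meta-algorithm quoted in the Pseudocode row can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses scikit-learn logistic regression as a stand-in for the WideResNet models, and Gaussian-noise augmentation (loosely echoing the paper's stability training with σ = 0.25) as a crude stand-in for the robust loss L_robust; the function name and toy data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def robust_self_training(X_lab, y_lab, X_unlab, w=1.0, sigma=0.25, seed=0):
    """Sketch of Meta-Algorithm 1: standard fit -> pseudo-label -> robust fit."""
    rng = np.random.default_rng(seed)
    # Step 1: learn the intermediate model on labeled data (standard loss).
    intermediate = LogisticRegression().fit(X_lab, y_lab)
    # Step 2: generate pseudo-labels for the unlabeled data.
    y_pseudo = intermediate.predict(X_unlab)
    # Step 3: learn the final model on labeled + pseudo-labeled data,
    # weighting unlabeled points by w. Gaussian-noise augmentation is a
    # crude stand-in for a robust (stability-style) loss.
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    X_all = X_all + sigma * rng.standard_normal(X_all.shape)
    weights = np.concatenate([np.ones(len(y_lab)), w * np.ones(len(y_pseudo))])
    return LogisticRegression().fit(X_all, y_all, sample_weight=weights)

# Toy usage: two well-separated blobs, 10 labeled and 90 unlabeled points.
rng = np.random.default_rng(1)
X0 = rng.normal(-2.0, 0.5, size=(50, 2))
X1 = rng.normal(+2.0, 0.5, size=(50, 2))
X_lab = np.vstack([X0[:5], X1[:5]])
y_lab = np.array([0] * 5 + [1] * 5)
X_unlab = np.vstack([X0[5:], X1[5:]])
model = robust_self_training(X_lab, y_lab, X_unlab)
print(model.predict([[-2.0, -2.0], [2.0, 2.0]]))
```

The unlabeled weight w and noise scale sigma mirror the w and σ = 0.25 parameters quoted from the paper; everything else here is simplified for illustration.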