Explicit Tradeoffs between Adversarial and Natural Distributional Robustness
Authors: Mazda Moayeri, Kiarash Banihashem, Soheil Feizi
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first consider a simple linear regression setting on Gaussian data with disjoint sets of core and spurious features. In this setting, through theoretical and empirical analysis, we show that (i) adversarial training with ℓ∞ and ℓ2 norms increases the model reliance on spurious features; (ii) For ℓ∞ adversarial training, spurious reliance only occurs when the scale of the spurious features is larger than that of the core features; (iii) adversarial training can have an unintended consequence in reducing distributional robustness, specifically when spurious correlations are changed in the new test domain. Next, we present extensive empirical evidence, using a test suite of twenty adversarially trained models evaluated on five benchmark datasets (ObjectNet, RIVAL10, Salient ImageNet-1M, ImageNet-9, Waterbirds), that adversarially trained classifiers rely on backgrounds more than their standardly trained counterparts, validating our theoretical results. (A standard robust-regression identity illustrating the norm dependence is sketched below the table.) |
| Researcher Affiliation | Academia | Mazda Moayeri (mmoayeri@umd.edu), Kiarash Banihashem (kiarash@umd.edu), Soheil Feizi (sfeizi@cs.umd.edu), Department of Computer Science, University of Maryland |
| Pseudocode | No | The paper contains mathematical derivations and problem formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the methodology is openly available. |
| Open Datasets | Yes | We evaluate models on two backbones (ResNet18, ResNet50) adversarially trained on ImageNet [12] using two norms (ℓ2, ℓ∞) under five attack budgets (denoted ε) per norm, resulting in a 2 × 2 × 5 = 20 model test suite, as well as standardly trained baselines. We appeal to the ImageNet-C [22] and ObjectNet [5] OOD benchmarks. We now directly quantify sensitivity to core features via RIVAL10 and Salient ImageNet-1M datasets [42, 61]. Now, we take a closer look at the reliance of adversarially trained models on the contextual spurious feature of backgrounds via the synthetic datasets ImageNet-9 [69] and Waterbirds [50]. We train ResNet18s on CIFAR10 [33]... |
| Dataset Splits | Yes | We train only a final linear layer atop the frozen feature extractors (so that models remain adversarially robust) for each of our models on the Waterbirds training set for ten epochs, saving the model with highest validation accuracy. The test set is evenly split between these groups. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud instance specifications. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We evaluate models on two backbones (ResNet18, ResNet50) adversarially trained on ImageNet [12] using two norms (ℓ2, ℓ∞) under five attack budgets (denoted ε) per norm, resulting in a 2 × 2 × 5 = 20 model test suite, as well as standardly trained baselines. We train only a final linear layer atop the frozen feature extractors (so that models remain adversarially robust) for each of our models on the Waterbirds training set for ten epochs, saving the model with highest validation accuracy. (A minimal linear-probe sketch of this setup follows the table.) |
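
The Research Type row quotes the paper's linear-regression analysis of ℓ∞ and ℓ2 adversarial training. As background, the following is a standard worst-case-loss identity for a linear model; it is a sketch of why ℓ∞-bounded training acts like an ℓ1 (lasso) penalty on the weights and ℓ2-bounded training like an ℓ2 penalty, not a reproduction of the paper's own derivation.

```latex
% Worst-case squared loss of a linear predictor w under an input perturbation
% of budget \epsilon (standard dual-norm identity; the paper's formulation may differ).
\max_{\|\delta\|_\infty \le \epsilon} \bigl(y - w^\top (x + \delta)\bigr)^2
  = \bigl(|y - w^\top x| + \epsilon \|w\|_1\bigr)^2,
\qquad
\max_{\|\delta\|_2 \le \epsilon} \bigl(y - w^\top (x + \delta)\bigr)^2
  = \bigl(|y - w^\top x| + \epsilon \|w\|_2\bigr)^2 .
```

Under the induced ℓ1 penalty, a feature earns a nonzero weight only when its scale is large enough to offset the penalty, which is consistent with the quoted finding that ℓ∞ adversarial training relies on spurious features only when their scale exceeds that of the core features.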
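
The Dataset Splits and Experiment Setup rows describe a linear probe: only a final linear layer is trained on top of frozen, adversarially trained features, on the Waterbirds training set, for ten epochs, keeping the checkpoint with the highest validation accuracy. A minimal sketch of that procedure is below; the placeholder data loaders, the optimizer, the learning rate, and the checkpoint handling are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

# Placeholder loaders standing in for the Waterbirds train/validation splits (assumption).
train_loader = DataLoader(TensorDataset(torch.randn(64, 3, 224, 224),
                                        torch.randint(0, 2, (64,))), batch_size=32)
val_loader = DataLoader(TensorDataset(torch.randn(32, 3, 224, 224),
                                      torch.randint(0, 2, (32,))), batch_size=32)

# Frozen backbone; in practice an adversarially trained ImageNet checkpoint would be
# loaded here (e.g. one of the ℓ2/ℓ∞ robust ResNet18s from the paper's test suite).
backbone = models.resnet18()
backbone.fc = nn.Identity()              # expose the 512-d penultimate features
for p in backbone.parameters():
    p.requires_grad = False              # freezing keeps the robust features intact
backbone.eval()

probe = nn.Linear(512, 2)                # Waterbirds is a binary (landbird/waterbird) task
optimizer = torch.optim.SGD(probe.parameters(), lr=1e-3, momentum=0.9)  # assumed hyperparameters
criterion = nn.CrossEntropyLoss()

best_val_acc, best_state = 0.0, None
for epoch in range(10):                  # "ten epochs" per the quoted setup
    probe.train()
    for x, y in train_loader:
        with torch.no_grad():
            feats = backbone(x)          # backbone stays frozen
        loss = criterion(probe(feats), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Keep the probe with the highest validation accuracy, as described.
    probe.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in val_loader:
            preds = probe(backbone(x)).argmax(dim=1)
            correct += (preds == y).sum().item()
            total += y.numel()
    val_acc = correct / total
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        best_state = {k: v.clone() for k, v in probe.state_dict().items()}

probe.load_state_dict(best_state)
```

The selected probe could then be evaluated on the group-balanced Waterbirds test set to compare background reliance across robust and standard backbones.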