Doubly Robust Instance-Reweighted Adversarial Training
Authors: Daouda Sow, Sen Lin, Zhangyang Wang, Yingbin Liang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on standard classification datasets demonstrate that our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance, and at the same time improves the robustness against attacks on the weakest data points. (Section 4: Experiments) |
| Researcher Affiliation | Academia | Daouda A. Sow, Department of ECE, The Ohio State University (sow.53@osu.edu); Sen Lin, Department of CS, University of Houston (slin50@central.uh.edu); Zhangyang Wang, Visual Informatics Group, University of Texas at Austin (atlaswang@utexas.edu); Yingbin Liang, Department of ECE, The Ohio State University (liang.889@osu.edu) |
| Pseudocode | Yes | Algorithm 1 Compositional Implicit Differentiation (CID) |
| Open Source Code | Yes | Pytorch codes for our method are provided in the supplementary material of our submission. |
| Open Datasets | Yes | We consider image classification problems and compare the performance of the baselines on four datasets: CIFAR10 Krizhevsky & Hinton (2009), SVHN Netzer et al. (2011), STL10 Coates et al. (2011), and GTSRB Stallkamp et al. (2012). |
| Dataset Splits | Yes | All hyperparameters were fixed by holding out 10% of the training data as a validation set and selecting the values that achieve the best performance on the validation set. ...For CIFAR10, SVHN, and STL10 we use the training and test splits provided by Torchvision. (A holdout-split sketch follows the table.) |
| Hardware Specification | Yes | We run all baselines on a single NVIDIA Tesla V100 GPU. |
| Software Dependencies | Yes | All codes are tested with Python 3.7 and Pytorch 1.8. |
| Experiment Setup | Yes | More details about the training and hyperparameter search can be found in Appendix B. ...we train our baselines using stochastic gradient descent with a minibatch size of 128 and a momentum of 0.9. We use ResNet-18 as the backbone network as in Madry et al. (2017) and train our baselines for 60 epochs with a cyclic learning rate schedule where the maximum learning rate is set to 0.2 ...For the KL-divergence regularization parameter r in our algorithms, we use a decayed schedule where we initially set it to 10 and decay it to 1 and 0.1, respectively at epochs 40 and 50 (see fig. 2). (A training-setup sketch follows the table.) |
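
To make the reported validation protocol concrete, here is a minimal sketch of holding out 10% of the Torchvision training split as a validation set, as the Dataset Splits row describes. The data path, transform, and random seed are placeholders, not values taken from the authors' released code.

```python
# Hedged sketch: hold out 10% of the CIFAR10 training split for validation.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # placeholder transform; the paper's augmentations may differ
train_full = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

n_val = int(0.1 * len(train_full))      # 10% held out for hyperparameter selection
n_train = len(train_full) - n_val
train_set, val_set = random_split(
    train_full, [n_train, n_val], generator=torch.Generator().manual_seed(0)
)
```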
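
And a minimal sketch of the reported training setup: SGD with batch size 128 and momentum 0.9, a ResNet-18 backbone, 60 epochs under a cyclic learning-rate schedule peaking at 0.2, and the KL-regularization parameter r decayed 10 → 1 → 0.1 at epochs 40 and 50. The OneCycleLR choice, the weight decay (unspecified here), the torchvision ResNet-18 variant, and the plain cross-entropy loss are assumptions standing in for the paper's doubly robust instance-reweighted adversarial objective; `train_set` is reused from the sketch above.

```python
# Hedged sketch of the reported optimization setup; not the authors' released code.
import torch
from torch.utils.data import DataLoader
from torchvision.models import resnet18

EPOCHS, BATCH_SIZE = 60, 128
model = resnet18(num_classes=10)                       # stand-in for the CIFAR-style ResNet-18
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
scheduler = torch.optim.lr_scheduler.OneCycleLR(       # one common cyclic schedule; an assumption
    optimizer, max_lr=0.2, epochs=EPOCHS, steps_per_epoch=len(train_loader)
)

def kl_reg_r(epoch):
    """Decayed schedule for the KL-divergence regularization parameter r."""
    if epoch < 40:
        return 10.0
    if epoch < 50:
        return 1.0
    return 0.1

for epoch in range(EPOCHS):
    r = kl_reg_r(epoch)
    for x, y in train_loader:
        # The paper's adversarial example generation and doubly robust
        # reweighted loss (with KL weight r) would replace this placeholder loss.
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()                               # OneCycleLR steps per batch
```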