Fantastic Robustness Measures: The Secrets of Robust Generalization
Authors: Hoki Kim, Jinseong Park, Yujin Choi, Jaewook Lee
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this study, we present a large-scale analysis of robust generalization to empirically verify whether the relationship between these measures and robust generalization remains valid in diverse settings. We demonstrate when and how these measures effectively capture the robust generalization gap by comparing over 1,300 models trained on CIFAR-10 under the L∞ norm and further validate our findings through an evaluation of more than 100 models from RobustBench [12] across CIFAR-10, CIFAR-100, and ImageNet. |
| Researcher Affiliation | Academia | Hoki Kim, Seoul National University, ghrl9613@snu.ac.kr; Jinseong Park, Seoul National University, jinseong@snu.ac.kr; Yujin Choi, Seoul National University, uznhigh@snu.ac.kr; Jaewook Lee, Seoul National University, jaewook@snu.ac.kr |
| Pseudocode | No | The paper includes mathematical equations for measures and descriptions of procedures, but no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To promote reproducibility and transparency in the field of deep learning, we have integrated the code used in this paper, along with pre-trained models, accessible to the public at https://github.com/Harry24k/MAIR. |
| Open Datasets | Yes | We train over 1,300 models on CIFAR-10 under the L∞ norm across various training settings. To further validate our findings, we also analyze over 100 models provided in RobustBench [12] across CIFAR-10, CIFAR-100, and ImageNet. |
| Dataset Splits | No | The paper mentions 'training data' and 'test datasets' but does not specify a separate 'validation' split or its proportions. Section 3.1 states: '1,344 models were trained using the CIFAR-10 dataset with ϵ = 8/255. Given these models, we evaluate their train/test robustness against PGD with 10 iterations (denoted as PGD10).' (See the PGD-10 evaluation sketch below the table.) |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model or count) used to train or evaluate its models. |
| Software Dependencies | No | The paper cites 'PyTorch: An Imperative Style, High-Performance Deep Learning Library' [49] in its references, indicating that PyTorch was used, but it does not specify version numbers for PyTorch or any other software dependencies in the experimental setup description. |
| Experiment Setup | Yes | Based on these prior works, to mimic practical scenarios, we have carefully selected eight training parameters widely used for improving robust generalization: (1) Model architecture {ResNet18 [23], WRN28-10 [70], WRN34-10 [70]}, (2) Training methods {Standard, AT [42], TRADES [71], MART [61]}, (3) Inner maximization steps {1, 10}, (4) Optimizer {SGD, AWP [64]}, (5) Batch-size {32, 64, 128, 256}, (6) Data augmentation {No Augmentation, Use crop and flip}, (7) Extra-data {No extra data, Use extra data [9]}, and (8) Early-stopping {No early-stopping, Use early-stopping}. (See the configuration-grid sketch below the table.) |
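
The Dataset Splits row quotes the paper's evaluation of train/test robustness against PGD with 10 iterations under the L∞ norm with ϵ = 8/255. The following is a minimal sketch of such a PGD-10 evaluation, not the authors' code: the step size `alpha = 2/255`, the random start, and the `model`/`loader` names are assumptions; only the budget, norm, and iteration count come from the excerpt. The robust generalization gap is then the train robust accuracy minus the test robust accuracy.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD with a random start and `steps` iterations."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient-sign step, then project back into the eps-ball and valid range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def robust_accuracy(model, loader, device="cuda"):
    """Fraction of examples still classified correctly under PGD-10."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

# Robust generalization gap = train robust accuracy - test robust accuracy:
# gap = robust_accuracy(model, train_loader) - robust_accuracy(model, test_loader)
```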
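
The Experiment Setup row lists eight training parameters. The sketch below enumerates that grid; the dictionary keys and the exclusion rule are our assumptions (the excerpt does not state how the options combine). Under the assumption that the inner-maximization-steps option is only varied for the adversarial training methods (AT, TRADES, MART) and not for Standard training, the enumeration yields exactly the 1,344 models mentioned in Section 3.1.

```python
from itertools import product

# Eight training options from the Experiment Setup row (names are ours).
grid = {
    "architecture": ["ResNet18", "WRN28-10", "WRN34-10"],
    "training":     ["Standard", "AT", "TRADES", "MART"],
    "inner_steps":  [1, 10],
    "optimizer":    ["SGD", "AWP"],
    "batch_size":   [32, 64, 128, 256],
    "augmentation": [False, True],   # crop and flip
    "extra_data":   [False, True],
    "early_stop":   [False, True],
}

configs = []
for values in product(*grid.values()):
    cfg = dict(zip(grid.keys(), values))
    # Assumption: Standard training has no inner maximization, so keep one setting.
    if cfg["training"] == "Standard" and cfg["inner_steps"] != 1:
        continue
    configs.append(cfg)

print(len(configs))  # 1344 = 192 (Standard) + 1152 (AT/TRADES/MART)
```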