Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Generating Less Certain Adversarial Examples Improves Robust Generalization
Authors: Minxing Zhang, Michael Backes, Xiao Zhang
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on image benchmarks demonstrate that our method effectively learns models with consistently improved robustness and mitigates robust overfitting, confirming the importance of generating less certain adversarial examples for robust generalization. Our implementations are available as open-source code at: https://github.com/TrustMLRG/AdvCertainty. This section examines the performance of our DAC method under ℓ∞ perturbations with ϵ = 8/255 on various model architectures, including PreActResNet-18, denoted as PRN18, and WideResNet-34, denoted as WRN34. We train a model for 200 epochs using SGD with a momentum of 0.9. |
| Researcher Affiliation | Academia | Minxing Zhang EMAIL CISPA Helmholtz Center for Information Security Michael Backes EMAIL CISPA Helmholtz Center for Information Security Xiao Zhang EMAIL CISPA Helmholtz Center for Information Security |
| Pseudocode | No | The paper describes the DAC method in Section 5, 'Decreasing Adversarial Certainty Helps Robust Generalization'. It formulates the problem with mathematical equations and describes the two-step optimization in Equation (3). However, it is not presented in a clearly labeled pseudocode block or algorithm box. |
| Open Source Code | Yes | Our implementations are available as open-source code at: https://github.com/TrustMLRG/AdvCertainty. |
| Open Datasets | Yes | Through extensive experiments on image benchmark datasets, we demonstrate that our method consistently produces more robust models when combined with various adversarial training algorithms, and robust overfitting is significantly mitigated with the involvement of DAC (Section 6.1). Moreover, we find that our proposed adversarial certainty has an implicit effect on existing robustness-enhancing techniques that are even designed based on different insights (Section 6.2). Besides, we provide a more intuitive demonstration of DAC's efficacy (Section 6.3), and update the explicit optimization of adversarial certainty by using a regularization term to improve the efficiency (Section 6.4). These empirical results again indicate the importance of adversarial certainty in understanding adversarial training and bring a further comprehension of our work. ... three widely-used benchmark datasets: CIFAR-10 Krizhevsky & Hinton (2009), CIFAR-100 Krizhevsky & Hinton (2009) and SVHN Netzer et al. (2011) |
| Dataset Splits | Yes | We evaluate the effectiveness of our DAC method in improving robust generalization on three widely-used benchmark datasets: CIFAR-10 Krizhevsky & Hinton (2009), CIFAR-100 Krizhevsky & Hinton (2009) and SVHN Netzer et al. (2011) |
| Hardware Specification | Yes | For instance, for a PRN18 model of AT and CIFAR-10 on a single NVIDIA A100 GPU, DAC costs 143s per training epoch on average while DAC_Reg costs 80s. |
| Software Dependencies | No | The paper mentions 'SGD' (Stochastic Gradient Descent) and 'PGD' (Projected Gradient Descent) as optimization algorithms and attack methods, respectively. However, it does not specify any software libraries or frameworks (e.g., PyTorch, TensorFlow) with version numbers that would be required to reproduce the experiments. |
| Experiment Setup | Yes | We train a model for 200 epochs using SGD with a momentum of 0.9. Besides, the initial learning rate is 0.1, and is divided by 10 at the 100-th epoch and at the 150-th epoch. The adversarial attack used in training is PGD-10 with a step size of 1/255 for SVHN, and 2/255 for CIFAR-10 and CIFAR-100, while we utilize the commonly-used attack benchmarks of PGD-20 Madry et al. (2018), PGD-100 Madry et al. (2018), CW Carlini & Wagner (2017) and AutoAttack Croce & Hein (2020) for evaluation. |
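The training recipe quoted in the Experiment Setup row (200 epochs of SGD with momentum 0.9, initial learning rate 0.1 divided by 10 at epochs 100 and 150, and PGD-10 within an ℓ∞ budget of ϵ = 8/255) can be sketched in plain Python. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the one-dimensional toy loss, the function names, and the analytic gradient are all invented here purely to show the schedule and the projection logic.

```python
# Sketch of the quoted learning-rate schedule and a PGD-style l_inf
# attack loop. Toy 1-D example -- NOT the paper's actual code.

EPS = 8 / 255        # l_inf perturbation budget (epsilon = 8/255)
ALPHA = 2 / 255      # PGD step size quoted for CIFAR-10/CIFAR-100
PGD_STEPS = 10       # PGD-10 during training

def learning_rate(epoch, base_lr=0.1):
    """Piecewise-constant schedule: divide by 10 at epochs 100 and 150."""
    if epoch < 100:
        return base_lr
    if epoch < 150:
        return base_lr / 10
    return base_lr / 100

def sign(v):
    """Sign of a scalar: -1, 0, or +1."""
    return (v > 0) - (v < 0)

def pgd_attack(x0, grad_fn, eps=EPS, alpha=ALPHA, steps=PGD_STEPS):
    """Projected gradient ascent on a scalar input: step along the
    loss-gradient sign, project back onto the l_inf ball around x0,
    and clip to the valid pixel range [0, 1]."""
    x = x0
    for _ in range(steps):
        x = x + alpha * sign(grad_fn(x))        # ascend the loss
        x = max(x0 - eps, min(x0 + eps, x))     # l_inf projection
        x = max(0.0, min(1.0, x))               # pixel-range clip
    return x

# Toy loss L(x) = (x - 0.3)^2 with analytic gradient 2 * (x - 0.3).
loss = lambda x: (x - 0.3) ** 2
grad = lambda x: 2 * (x - 0.3)

x0 = 0.5
x_adv = pgd_attack(x0, grad)
```

With 10 steps of size 2/255, the iterate hits the ϵ = 8/255 boundary after four steps and the projection keeps it there, so the final perturbation never exceeds the budget while the loss strictly increases.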