Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
Authors: Tianyu Pang, Min Lin, Xiao Yang, Jun Zhu, Shuicheng Yan
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Sec. 5, we validate the effectiveness of replacing KL divergence with distance-based metrics (and their variants), developed from the analyses of SCORE. We improve the state-of-the-art AT methods under AutoAttack (Croce and Hein, 2020), and achieve top-rank performance with 1M DDPM generated data on the leaderboards of CIFAR-10 and CIFAR-100 on RobustBench (Croce et al., 2020). (See the loss sketch after this table.) |
| Researcher Affiliation | Collaboration | (1) Dept. of Comp. Sci. and Tech., Institute for AI, BNRist Center, THBI Lab, Tsinghua-Bosch Joint Center for ML, Tsinghua University; (2) Sea AI Lab, Singapore. |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Code is at https://github.com/P2333/SCORE. |
| Open Datasets | Yes | We improve the state-of-the-art AT methods under AutoAttack (Croce and Hein, 2020), and achieve top-rank performance with 1M DDPM generated data on the leaderboards of CIFAR-10 and CIFAR-100 on RobustBench (Croce et al., 2020). |
| Dataset Splits | Yes | For our methods, we report the results on the checkpoint with the highest value of PGD-10 (SE) accuracy on a separate validation set, similarly to Rice et al. (2020). |
| Hardware Specification | No | The paper mentions using 'large models' and notes 'limited computational resources' but does not provide specific hardware details such as GPU or CPU models used for experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch implementation' and 'SGD momentum optimizer' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | In training, we use the SGD momentum optimizer with batch size 128 and weight decay 5e-4. We exploit the PGD-AT (Madry et al., 2018) and TRADES (Zhang et al., 2019) frameworks. The training attack is 10-step PGD with step size α = 2/255 for the ℓ∞ threat model and α = 16/255 for the ℓ2 threat model. Training runs for 110 epochs, with the learning rate decayed by a factor of 0.1 at epochs 100 and 105. The hyperparameter β = 6 in the TRADES experiments. (See the configuration sketch after this table.) |
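
The "Research Type" row above notes that the experiments replace the KL divergence term in adversarial training with distance-based metrics. The snippet below is a minimal PyTorch sketch of that idea applied to a TRADES-style objective. The specific distance used here (a squared ℓ2 distance between predicted probability vectors), the function names, and the β = 6 weighting are illustrative assumptions; the authors' code at https://github.com/P2333/SCORE implements the actual variants.

```python
import torch
import torch.nn.functional as F

def trades_kl_loss(model, x_nat, x_adv, y, beta=6.0):
    """Standard TRADES objective: clean CE + beta * KL(f(x_nat) || f(x_adv))."""
    logits_nat = model(x_nat)
    logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_nat, y)
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_nat, dim=1),
                  reduction="batchmean")
    return ce + beta * kl

def trades_distance_loss(model, x_nat, x_adv, y, beta=6.0):
    """SCORE-motivated variant: the KL term is swapped for a squared L2
    distance between predicted probability vectors (illustrative form only)."""
    logits_nat = model(x_nat)
    logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_nat, y)
    dist = ((F.softmax(logits_adv, dim=1) -
             F.softmax(logits_nat, dim=1)) ** 2).sum(dim=1).mean()
    return ce + beta * dist
```

The sketch only illustrates the structural change from a divergence term to a distance term; the paper evaluates several such metrics and their variants.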
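The "Experiment Setup" row lists the optimizer, schedule, and training attack. Below is a minimal PyTorch sketch of that configuration; the momentum value (0.9), the initial learning rate (0.1), and the ℓ∞ radius ε = 8/255 are assumptions not stated in the table, and the attack shown is a generic 10-step PGD rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def build_optimizer_and_schedule(model, lr=0.1):
    # SGD momentum optimizer with weight decay 5e-4 (batch size 128 is set on
    # the DataLoader); lr=0.1 and momentum=0.9 are assumed defaults.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    # 110 training epochs, learning rate decayed by a factor of 0.1 at
    # epochs 100 and 105 (call scheduler.step() once per epoch).
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[100, 105], gamma=0.1)
    return optimizer, scheduler

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Generic 10-step PGD under the l-inf threat model with step size 2/255;
    # eps = 8/255 is an assumed radius, not stated in the table.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```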