Robustness and Accuracy Could Be Reconcilable by (Proper) Definition

Authors: Tianyu Pang, Min Lin, Xiao Yang, Jun Zhu, Shuicheng Yan

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Sec. 5, we validate the effectiveness of replacing KL divergence with distance-based metrics (and their variants), developed from the analyses of SCORE. We improve the state-of-the-art AT methods under AutoAttack (Croce and Hein, 2020), and achieve top-rank performance with 1M DDPM-generated data on the leaderboards of CIFAR-10 and CIFAR-100 on RobustBench (Croce et al., 2020). (A hedged sketch of this metric substitution follows the table.)
Researcher Affiliation | Collaboration | (1) Dept. of Comp. Sci. and Tech., Institute for AI, BNRist Center, THBI Lab, Tsinghua-Bosch Joint Center for ML, Tsinghua University; (2) Sea AI Lab, Singapore.
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | Yes | Code is available at https://github.com/P2333/SCORE.
Open Datasets | Yes | We improve the state-of-the-art AT methods under AutoAttack (Croce and Hein, 2020), and achieve top-rank performance with 1M DDPM-generated data on the leaderboards of CIFAR-10 and CIFAR-100 on RobustBench (Croce et al., 2020).
Dataset Splits | Yes | For our methods, we report the results on the checkpoint with the highest PGD-10 (SE) accuracy on a separate validation set, similarly to Rice et al. (2020). (A checkpoint-selection sketch follows the table.)
Hardware Specification | No | The paper mentions using 'large models' and notes 'limited computational resources', but does not provide specific hardware details such as the GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions a 'PyTorch implementation' and an 'SGD momentum optimizer', but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | In training, we use the SGD momentum optimizer with batch size 128 and weight decay 5e-4. We exploit the PGD-AT (Madry et al., 2018) and TRADES (Zhang et al., 2019) frameworks. The training attack is 10-step PGD with step size α = 2/255 for the ℓ∞ threat model and α = 16/255 for the ℓ2 threat model. Training runs for 110 epochs, with the learning rate decaying by a factor of 0.1 at the 100th and 105th epochs, respectively. The hyperparameter β = 6 in the TRADES experiments. (A training-schedule sketch follows the table.)
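
To make the Research Type row concrete: the paper's central modification is swapping the KL-divergence robustness term of a TRADES-style objective for a distance-based metric. Below is a minimal PyTorch sketch of that substitution, assuming a classifier `model`, clean inputs `x_clean`, adversarial counterparts `x_adv`, and labels `y`; the squared-error distance shown is one illustrative choice among the several metrics and variants the paper studies, not a reproduction of the official implementation.

```python
import torch
import torch.nn.functional as F

def trades_kl_loss(model, x_clean, x_adv, y, beta=6.0):
    """Standard TRADES objective: clean cross-entropy plus a
    KL-divergence term between adversarial and clean predictions."""
    logits_clean = model(x_clean)
    logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_clean, y)
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_clean, dim=1),
                  reduction="batchmean")
    return ce + beta * kl

def distance_based_loss(model, x_clean, x_adv, y, beta=6.0):
    """Same structure, but the KL term is replaced by a squared-error
    distance between the two softmax outputs (illustrative choice;
    the paper analyzes several distance-based metrics)."""
    logits_clean = model(x_clean)
    logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_clean, y)
    dist = ((F.softmax(logits_adv, dim=1)
             - F.softmax(logits_clean, dim=1)) ** 2).sum(dim=1).mean()
    return ce + beta * dist
```

With β = 6 (the value quoted in the Experiment Setup row), the variant keeps the clean-accuracy term untouched and only changes how consistency between clean and adversarial predictions is penalized.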
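The Experiment Setup row pins down most of the training recipe. Here is a sketch of that schedule together with the 10-step PGD training attack, reusing `distance_based_loss` from the sketch above. `model` and `train_loader` are hypothetical placeholders; the initial learning rate of 0.1, momentum of 0.9, and ε = 8/255 ℓ∞ radius are assumptions (common CIFAR settings), since the quoted text states only the step size, decay factor, and milestones.

```python
import torch
import torch.nn.functional as F

def pgd_10(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """10-step l_inf PGD with step size alpha = 2/255, as quoted above.
    eps = 8/255 is an assumed radius (the usual CIFAR setting)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball and valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

# SGD momentum (0.9 assumed), batch size 128, weight decay 5e-4,
# 110 epochs, LR decayed by 0.1 at epochs 100 and 105 (as quoted above).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 105], gamma=0.1)

for epoch in range(110):
    for x, y in train_loader:  # batch size 128
        model.eval()           # freeze BN statistics while attacking
        x_adv = pgd_10(model, x, y)
        model.train()
        loss = distance_based_loss(model, x, x_adv, y, beta=6.0)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```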
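Finally, the Dataset Splits row describes selecting the reported checkpoint by robust validation accuracy rather than taking the last epoch. A minimal sketch of that selection rule, assuming a held-out `val_loader` and reusing `pgd_10` from the sketch above; `train_one_epoch` and the variable names are placeholders, not the paper's code.

```python
import copy
import torch

def robust_accuracy(model, loader, attack):
    """Accuracy on adversarial examples crafted by `attack`."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x_adv = attack(model, x, y)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

best_acc, best_state = 0.0, None
for epoch in range(110):
    train_one_epoch(model, train_loader, optimizer)  # hypothetical helper
    scheduler.step()
    # Track PGD-10 accuracy on the separate validation split and keep
    # the best checkpoint, following the protocol quoted above.
    acc = robust_accuracy(model, val_loader, attack=pgd_10)
    if acc > best_acc:
        best_acc = acc
        best_state = copy.deepcopy(model.state_dict())
torch.save(best_state, "best_checkpoint.pt")
```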