Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization
Authors: Yihua Zhang, Guanhua Zhang, Prashant Khanduri, Mingyi Hong, Shiyu Chang, Sijia Liu
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In practice, we show our method yields substantial robustness improvements over baselines across multiple models and datasets. All experiments are run on a single GeForce RTX 3090 GPU. |
| Researcher Affiliation | Collaboration | 1 Michigan State University, 2 UC Santa Barbara, 3 University of Minnesota, 4 MIT-IBM Watson AI Lab, IBM Research. |
| Pseudocode | Yes | FAST-BAT algorithm. Lower-level solution: obtain $\delta^*(\theta_t)$ from (8), $\delta^*(\theta) = P_{\mathcal{C}}\big(z - (1/\lambda)\,\nabla_{\delta}\ell_{\mathrm{atk}}(\theta,\delta)\big\vert_{\delta=z}\big)$. Upper-level model training: integrating the IG (12) into (4), call SGD to update the model parameters as $\theta_{t+1} = \theta_t - \alpha_1 \nabla_{\theta}\ell_{\mathrm{tr}}(\theta_t,\delta^*) + (\alpha_2/\lambda)\,\nabla^2_{\theta\delta}\ell_{\mathrm{atk}}(\theta_t,\delta^*)\,H_{\mathcal{C}}\,\nabla_{\delta}\ell_{\mathrm{tr}}(\theta_t,\delta^*)$ (13), where $\alpha_1, \alpha_2 > 0$ are learning rates associated with the model gradient and the IG-augmented descent direction. (See the code sketch after the table.) |
| Open Source Code | Yes | Codes are available at https:// github.com/OPTML-Group/Fast-BAT. |
| Open Datasets | Yes | We will evaluate the effectiveness of our proposal under CIFAR-10 (Krizhevsky & Hinton, 2009), CIFAR-100 (Krizhevsky & Hinton, 2009), Tiny-ImageNet (Deng et al., 2009), and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | Minor performance degradation compared to the results reported in the original papers is due to a different training setting: we train all the methods with only 90% of the training data, choose the best model on the validation set (the remaining 10% of the training data), and evaluate on the test set. (A split sketch is given after the table.) |
| Hardware Specification | Yes | All experiments are run on a single GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions software components such as the 'SGD optimizer' and 'cyclic scheduler' and implies the use of a standard deep-learning framework (e.g., PyTorch), but it does not specify version numbers for any of these dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | Training details. We choose the training perturbation strength $\epsilon \in \{8, 16\}/255$ for CIFAR-10, CIFAR-100, and Tiny-ImageNet, and $\epsilon = 2/255$ for ImageNet, following (Wong et al., 2020; Andriushchenko & Flammarion, 2020). Throughout the experiments, we utilize an SGD optimizer with a momentum of 0.9 and a weight decay of $5 \times 10^{-4}$. For CIFAR-10, CIFAR-100, and Tiny-ImageNet, we train each model for 20 epochs in total, using a cyclic scheduler to adjust the learning rate: the learning rate linearly ascends from 0 to 0.2 within the first 10 epochs and then reduces to 0 within the last 10 epochs. Our batch size is set to 128 for all settings. In the implementation of FAST-BAT, we follow a dataset-agnostic hyperparameter scheme, with the value 255/5000 for $\epsilon = 8/255$ and 255/2500 for $\epsilon = 16/255$ on CIFAR-10, CIFAR-100, and Tiny-ImageNet. For ImageNet, we strictly follow the setup given by (Wong et al., 2020) and choose the train-time attack budget $\epsilon = 2/255$. (A minimal training-loop sketch follows the table.) |
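
To make the pseudocode row concrete, here is a minimal PyTorch sketch of one FAST-BAT-style update, assuming an $\ell_\infty$ perturbation set and an attack objective equal to the negative cross-entropy (plus a proximal term whose gradient vanishes at the starting point). The function name `fast_bat_step` and the argument names (`z`, `lmbda`, `alpha1`, `alpha2`) are illustrative, not taken from the authors' released code.

```python
import torch
import torch.nn.functional as F

def fast_bat_step(model, x, y, z, eps, lmbda, alpha1, alpha2):
    """One FAST-BAT-style update (illustrative sketch, not the official code).

    x, y   : clean inputs and labels
    z      : attack starting point for the lower-level problem
    eps    : ell_inf perturbation budget
    lmbda  : lower-level regularization constant (1/lmbda acts as the attack step size)
    alpha1, alpha2 : learning rates for the model gradient and the IG correction
    """
    params = [p for p in model.parameters() if p.requires_grad]

    # Lower level, eq. (8): one projected step from z. We assume l_atk is the
    # negative cross-entropy, so descending l_atk increases the training loss.
    delta = z.clone().detach().requires_grad_(True)
    atk_obj = -F.cross_entropy(model(x + delta), y)
    grad_atk = torch.autograd.grad(atk_obj, delta)[0]
    delta_star = torch.clamp(z - grad_atk / lmbda, -eps, eps)

    # H_C: diagonal 0/1 mask, 1 where delta* lies strictly inside the box
    # (projection inactive), 0 where the constraint is tight.
    h_mask = (delta_star.abs() < eps).float()

    # Upper level: gradients of the training loss at (theta_t, delta*).
    delta_star = delta_star.detach().requires_grad_(True)
    tr_loss = F.cross_entropy(model(x + delta_star), y)
    grads = torch.autograd.grad(tr_loss, params + [delta_star])
    grad_theta, grad_delta_tr = grads[:-1], grads[-1]

    # Implicit-gradient correction, eqs. (12)-(13): the mixed second-order term
    # grad^2_{theta,delta} l_atk(theta, delta*) @ (H_C grad_delta l_tr),
    # computed as a Hessian-vector product.
    delta2 = delta_star.detach().requires_grad_(True)
    atk_obj2 = -F.cross_entropy(model(x + delta2), y)
    g = torch.autograd.grad(atk_obj2, delta2, create_graph=True)[0]
    v = (h_mask * grad_delta_tr).detach()
    ig_corr = torch.autograd.grad((g * v).sum(), params)

    # SGD-style update, eq. (13):
    # theta <- theta - alpha1 * grad_theta + (alpha2 / lmbda) * IG correction.
    with torch.no_grad():
        for p, g1, g2 in zip(params, grad_theta, ig_corr):
            p.add_(-alpha1 * g1 + (alpha2 / lmbda) * g2)
```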
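
The 90%/10% train/validation split described in the Dataset Splits row can be reproduced with a few lines of torchvision/PyTorch; the fixed seed and the `./data` path below are illustrative choices, not taken from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# 90%/10% train/validation split of the CIFAR-10 training set; the official
# test set is kept untouched for the final evaluation.
full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
n_val = len(full_train) // 10
train_set, val_set = random_split(
    full_train, [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))  # seed chosen for illustration
test_set = datasets.CIFAR10(root="./data", train=False,
                            transform=transforms.ToTensor())
```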
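
As a reading aid for the experiment-setup row, the following PyTorch sketch wires up an SGD optimizer (momentum 0.9, weight decay 5e-4) with the described cyclic learning-rate schedule (0 to 0.2 over the first 10 epochs, back to 0 over the last 10, stepped per batch). The toy model, random data, and plain cross-entropy objective are placeholders, not the paper's architecture or adversarial training objective.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins so the snippet runs end to end; substitute the paper's model,
# the real CIFAR-10 loaders, and the adversarial training objective.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
train_loader = DataLoader(
    TensorDataset(torch.randn(512, 3, 32, 32), torch.randint(0, 10, (512,))),
    batch_size=128, shuffle=True)

epochs, max_lr = 20, 0.2
steps_per_epoch = len(train_loader)

optimizer = optim.SGD(model.parameters(), lr=max_lr,
                      momentum=0.9, weight_decay=5e-4)
scheduler = optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.0, max_lr=max_lr,
    step_size_up=10 * steps_per_epoch,    # LR ramps linearly 0 -> 0.2 (epochs 1-10)
    step_size_down=10 * steps_per_epoch,  # then decays linearly 0.2 -> 0 (epochs 11-20)
    cycle_momentum=False)                 # keep momentum fixed at 0.9

for epoch in range(epochs):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)  # placeholder objective
        loss.backward()
        optimizer.step()
        scheduler.step()  # the cyclic schedule is advanced once per batch
```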