MORA: Improving Ensemble Robustness Evaluation with Model Reweighing Attack

Authors: Yunrui Yu, Xitong Gao, Cheng-Zhong Xu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comparing it against recent state-of-the-art white-box attacks, it can converge orders of magnitude faster while achieving higher attack success rates across all ensemble models examined with three different ensemble modes (i.e., ensembling by either softmax, voting, or logits). In particular, most ensemble defenses exhibit near or exactly 0% robustness against MORA with ℓ∞ perturbation within 0.02 on CIFAR-10, and 0.01 on CIFAR-100. We make MORA open source with reproducible results and pre-trained models, and provide a leaderboard of ensemble defenses under various attack strategies. (The three ensemble modes are illustrated in a sketch after this checklist.)
Researcher Affiliation | Academia | Yunrui Yu, State Key Lab of IOTSC, University of Macau, Macau SAR, China (yb97445@um.edu.mo); Xitong Gao, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China (xt.gao@siat.ac.cn); Cheng-Zhong Xu, State Key Lab of IOTSC, University of Macau, Macau SAR, China (czxu@um.edu.mo).
Pseudocode | Yes | We provide the overall algorithm in Algorithm 1, which computes an adversarial image x̂_I as its return value, taking as input the sub-models f[1:M], the natural image x, the ground-truth label y, β to interpolate between the auxiliary logits and the original, a parameter that controls the temperature, the momentum µ = 0.75 following [32, 5], the perturbation bound ε, and finally the maximum number of iterations I. (A generic sketch of such an attack loop follows the checklist.)
Open Source Code | Yes | We make MORA open source with reproducible results and pre-trained models, and provide a leaderboard of ensemble defenses under various attack strategies: https://github.com/lafeat/mora.
Open Datasets | Yes | Our robustness evaluation considers the ℓ∞ white-box attacks on the CIFAR-10 test set [15], with perturbation ε = 0.01 unless specified. [15] Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. The CIFAR-10 and CIFAR-100 datasets, 2014. Available at: http://www.cs.toronto.edu/~kriz/cifar.html.
Dataset Splits | Yes | Our robustness evaluation considers the ℓ∞ white-box attacks on the CIFAR-10 test set [15], with perturbation ε = 0.01 unless specified. CIFAR-10 is a standard benchmark with well-defined training, validation, and test splits commonly used in the field.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for the experiments are mentioned.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | Our robustness evaluation considers the ℓ∞ white-box attacks on the CIFAR-10 test set [15], with perturbation ε = 0.01 unless specified. PGD uses a fixed step size of ε/4. For a fair comparison, MORA with 500 iterations sweeps β ∈ {0, 0.25, 0.5, 0.75, 1}, with each β run for up to 100 iterations. The momentum is µ = 0.75 following [32, 5], and a fixed value of 0.1 is used universally (for softwta). (The β sweep and CIFAR-10 test-set setup are sketched after this checklist.)
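
The Research Type entry above refers to three ensemble modes: ensembling by softmax, voting, or logits. The following is a minimal sketch of what these three modes typically compute, assuming PyTorch sub-models that each return logits; the function name ensemble_output and its arguments are illustrative and not taken from the MORA code.

```python
import torch
import torch.nn.functional as F


def ensemble_output(sub_models, x, mode="softmax"):
    """Combine sub-model predictions under one of three common ensemble modes."""
    logits = torch.stack([m(x) for m in sub_models])      # (M, batch, classes)
    if mode == "logits":
        return logits.mean(dim=0)                          # average the raw logits
    if mode == "softmax":
        return F.softmax(logits, dim=-1).mean(dim=0)       # average the probabilities
    if mode == "voting":
        votes = F.one_hot(logits.argmax(dim=-1), logits.shape[-1])
        return votes.float().mean(dim=0)                   # per-class vote shares
    raise ValueError(f"unknown ensemble mode: {mode}")
```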
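
The Pseudocode entry lists the inputs of Algorithm 1: sub-models f[1:M], natural image x, ground-truth label y, β, a temperature parameter, momentum µ, perturbation bound ε, and iteration budget I. The sketch below only mirrors that interface with a generic momentum-PGD loop under an ℓ∞ bound; its loss simply sums equally weighted per-sub-model cross-entropies and does not implement MORA's model-reweighing objective or its β/temperature interpolation, so treat it as scaffolding rather than the paper's Algorithm 1.

```python
import torch
import torch.nn.functional as F


def linf_momentum_attack(sub_models, x, y, eps, iters, step=None, mu=0.75):
    """Generic momentum-PGD attack under an L-infinity perturbation bound."""
    step = eps / 4 if step is None else step            # fixed step size, as used for the PGD baseline
    delta = torch.zeros_like(x, requires_grad=True)     # perturbation, initialised at zero
    momentum = torch.zeros_like(x)
    for _ in range(iters):
        # Placeholder objective: equally weighted sub-model cross-entropies.
        loss = sum(F.cross_entropy(m(x + delta), y) for m in sub_models)
        (grad,) = torch.autograd.grad(loss, delta)
        momentum = mu * momentum + grad / grad.abs().mean().clamp_min(1e-12)
        delta = (delta + step * momentum.sign()).clamp(-eps, eps)               # stay inside the L-inf ball
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)     # keep the image in [0, 1]
    return (x + delta).detach()                          # adversarial image (x̂_I in the paper's notation)
```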
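
The Experiment Setup entry describes ε = 0.01 attacks on the CIFAR-10 test set and, for MORA, a sweep of β over {0, 0.25, 0.5, 0.75, 1} with up to 100 iterations per β (500 in total). Below is a minimal sketch of such an evaluation loop, assuming a torchvision CIFAR-10 loader and a hypothetical attack_fn callable that accepts a beta keyword (the momentum-PGD sketch above ignores β and would need a dummy beta argument to fit this interface); robust accuracy is reported under the softmax ensemble mode.

```python
import torch
import torch.nn.functional as F
import torchvision
import torchvision.transforms as T

# CIFAR-10 test split, as used in the paper's robustness evaluation.
test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=T.ToTensor())
test_loader = torch.utils.data.DataLoader(test_set, batch_size=256, shuffle=False)

EPS = 0.01                              # L-infinity perturbation bound on CIFAR-10
BETAS = [0.0, 0.25, 0.5, 0.75, 1.0]     # beta values swept by MORA
ITERS_PER_BETA = 100                    # 5 betas x 100 iterations = 500 in total


def robust_accuracy(sub_models, attack_fn):
    """Fraction of test images whose softmax-ensemble prediction survives every beta."""
    correct, total = 0, 0
    for x, y in test_loader:
        still_correct = torch.ones_like(y, dtype=torch.bool)
        for beta in BETAS:
            x_adv = attack_fn(sub_models, x, y, eps=EPS, iters=ITERS_PER_BETA, beta=beta)
            probs = torch.stack([F.softmax(m(x_adv), dim=-1) for m in sub_models]).mean(dim=0)
            still_correct &= probs.argmax(dim=-1) == y
        correct += still_correct.sum().item()
        total += y.numel()
    return correct / total
```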