Sparse-RS: A Versatile Framework for Query-Efficient Sparse Black-Box Adversarial Attacks
Authors: Francesco Croce, Maksym Andriushchenko, Naman D. Singh, Nicolas Flammarion, Matthias Hein
Pages: 6437-6445
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose a versatile framework based on random search, Sparse-RS, for score-based sparse targeted and untargeted attacks in the black-box setting. Sparse-RS does not rely on substitute models and achieves state-of-the-art success rate and query efficiency for multiple sparse attack models: l0-bounded perturbations, adversarial patches, and adversarial frames. The l0-version of untargeted Sparse-RS outperforms all black-box and even all white-box attacks for different models on MNIST, CIFAR-10, and ImageNet. Moreover, our untargeted Sparse-RS achieves very high success rates even for the challenging settings of 20 × 20 adversarial patches and 2-pixel wide adversarial frames for 224 × 224 images. Finally, we show that Sparse-RS can be applied to generate targeted universal adversarial patches where it significantly outperforms the existing approaches. Our code is available at https://github.com/fra31/sparse-rs. |
| Researcher Affiliation | Academia | Francesco Croce¹, Maksym Andriushchenko², Naman D. Singh¹, Nicolas Flammarion², Matthias Hein¹; ¹University of Tübingen, ²EPFL |
| Pseudocode | Yes | Algorithm 1: Sparse-RS |
| Open Source Code | Yes | Our code is available at https://github.com/fra31/sparse-rs. |
| Open Datasets | Yes | The l0-version of untargeted Sparse-RS outperforms all black-box and even all white-box attacks for different models on MNIST, CIFAR-10, and ImageNet. We focus on attacking normally trained VGG-16-BN and ResNet-50 models on ImageNet, which contains RGB images resized to shape 224 × 224, that is 50,176 pixels, belonging to 1,000 classes. |
| Dataset Splits | No | We evaluate the success rate on the initially correctly classified images out of 500 images from the validation set. While the paper mentions using a 'validation set' for evaluation, it does not provide specific details on the split percentages or sample counts for the training, validation, and test sets needed for full reproducibility of the data partitioning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software like 'VGG-16-BN and ResNet-50 models', but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | We consider perturbations of size k ∈ {50, 150} pixels to assess the effectiveness of the untargeted attacks at different thresholds with a limit of 10,000 queries. The quantity α(i) controls how much M′ differs from M and decays following a predetermined piecewise constant schedule rescaled according to the maximum number of queries N. We provide details about the algorithm, schedule, and values of α_init in App. A and B, and ablation studies for them in App. G. We evaluate on both models CornerSearch with a budget of 50,000 queries and l0-RS with an equivalent budget of 10,000 queries and 5 random restarts. The loss in Alg. 1 is computed on a small batch of 30 training images and the initial locations M of the patch in each of the training images are sampled randomly. In order not to overfit on the training batch, we resample training images and locations of the patches (step 6 in Alg. 1) every 10k queries (total query budget 100k). |
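
The random-search procedure quoted in the Experiment Setup row (a set M of k perturbed pixels, a fraction α(i) of which is exchanged at each step, with α decaying on a piecewise constant schedule rescaled by the query budget N) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository linked above. The function names `alpha_schedule` and `sparse_random_search`, the schedule breakpoints, the default `alpha_init=0.3`, and the toy loss are all assumptions made for illustration; the actual schedule and α_init values are given in the paper's App. A and B.

```python
import numpy as np


def alpha_schedule(it, n_queries, alpha_init):
    """Hypothetical piecewise-constant decay, rescaled by the query budget.

    The breakpoints (10%, 50%) and factors are illustrative assumptions,
    not the values used in the paper.
    """
    frac = it / n_queries
    if frac < 0.1:
        return alpha_init
    if frac < 0.5:
        return alpha_init / 2
    return alpha_init / 4


def sparse_random_search(loss_fn, x, k, n_queries, alpha_init=0.3, seed=0):
    """Minimise loss_fn over images that differ from x in at most k pixels.

    loss_fn is treated as a black box: only its scalar output is used.
    """
    rng = np.random.default_rng(seed)
    h, w, c = x.shape
    n_pixels = h * w

    # M: flat indices of perturbed pixels; values: their colours in {0, 1}^c.
    M = rng.choice(n_pixels, size=k, replace=False)
    values = rng.integers(0, 2, size=(k, c)).astype(x.dtype)

    def apply(idx, vals):
        z = x.reshape(n_pixels, c).copy()
        z[idx] = vals
        return z.reshape(h, w, c)

    best_x = apply(M, values)
    best_loss = loss_fn(best_x)  # first query

    for it in range(1, n_queries):
        # Number of pixels to exchange at this step, at least one.
        alpha = alpha_schedule(it, n_queries, alpha_init)
        n_swap = min(max(1, int(np.ceil(alpha * k))), k, n_pixels - k)

        drop = rng.choice(k, size=n_swap, replace=False)   # slots of M to replace
        outside = np.setdiff1d(np.arange(n_pixels), M)     # unperturbed pixels
        new_idx = rng.choice(outside, size=n_swap, replace=False)

        M_new, values_new = M.copy(), values.copy()
        M_new[drop] = new_idx
        values_new[drop] = rng.integers(0, 2, size=(n_swap, c))

        cand = apply(M_new, values_new)
        cand_loss = loss_fn(cand)  # one query per iteration

        # Greedy acceptance: keep the candidate only if the loss improves.
        if cand_loss < best_loss:
            M, values = M_new, values_new
            best_x, best_loss = cand, cand_loss

    return best_x, best_loss


if __name__ == "__main__":
    # Toy stand-in for a classifier margin loss (assumption): brighter is "better".
    x0 = np.zeros((8, 8, 3))
    adv, final_loss = sparse_random_search(lambda z: -float(z.sum()), x0, k=5, n_queries=200)
    print("perturbed pixels:", int(np.any(adv != x0, axis=-1).sum()))
```

Under this reading, the budgets quoted in the table are comparable because 5 random restarts of 10,000 queries each add up to the 50,000 queries given to CornerSearch. A restart here would simply mean calling `sparse_random_search` again with a different `seed`.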