CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks
Authors: Shashank Agnihotri, Steffen Jung, Margret Keuper
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide theoretical and empirical proofs for the stability and spatial balancing of CosPGD during attack optimization. For semantic segmentation, we compare CosPGD to the recently proposed SegPGD which also uses pixelwise information for generating attacks. CosPGD outperforms SegPGD by a significant margin. To demonstrate CosPGD's versatility, we also evaluate it as a targeted attack and as a non-targeted attack, for both ℓ2 and ℓ∞ bounds, on semantic segmentation, optical flow estimation and image restoration in several settings and datasets. 5. Experiments: To demonstrate the wide applicability of CosPGD, we conduct our experiments on distinct downstream tasks: semantic segmentation, optical flow estimation, and image restoration. |
| Researcher Affiliation | Academia | 1Data and Web Science Group, University of Mannheim, Germany 2Max-Planck-Institute for Informatics, Saarland Informatics Campus, Germany. |
| Pseudocode | Yes | Following we present the algorithm for CosPGD. Algorithm 1 provides a general overview of the implementation of CosPGD. |
| Open Source Code | Yes | We provide code for the CosPGD algorithm and example usage at https://github.com/shashankskagnihotri/cospgd. |
| Open Datasets | Yes | We use PASCAL VOC 2012 (Everingham et al., 2012), which contains 20 object classes and one background class, with 1464 training images, and 1449 validation images. We do these evaluations on the Cityscapes dataset (Cordts et al., 2016). Evaluations are performed on KITTI2015 (Menze & Geiger, 2015) and MPI Sintel (Butler et al., 2012; Wulff et al., 2012) validation sets. |
| Dataset Splits | Yes | We use PASCAL VOC 2012 (Everingham et al., 2012), which contains 20 object classes and one background class, with 1464 training images, and 1449 validation images. Cityscapes contains a total of 5000 high-quality images and pixel-wise annotations for urban scene understanding. The dataset is split into 2975, 500, and 1525 images for training, validation, and testing respectively. This dataset has 150 classes and is split into 25,574 training images and 2,000 validation images. |
| Hardware Specification | Yes | For the experiments on DeepLabV3, we used NVIDIA Quadro RTX 8000 GPUs. For PSPNet, we used NVIDIA A100 GPUs. For the experiments with UNet, we used NVIDIA GeForce RTX 3090 GPUs. We used NVIDIA V100 GPUs; a single GPU was used for each run. For the experiments on image de-blurring tasks, we used NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions various models and datasets used in the experiments but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For α, we follow (Gu et al., 2022) and set the step size to α = 0.01 (please refer to Appendix B.6 for an ablation study). For the ℓ2-norm constraint we follow common work (Croce et al., 2020; Wang et al., 2023) and use the same ϵ for CosPGD, SegPGD, and PGD, i.e. ϵ = 64/255 and α ∈ {0.1, 0.2}. We consider α ∈ {0.005, 0.01, 0.02, 0.04, 0.1}. |
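The hyperparameters quoted above follow the standard PGD recipe, with CosPGD's distinguishing idea being a pixel-wise cosine-similarity weighting of the loss. As a rough illustrative sketch only (not the authors' released implementation; the function names, the softmax choice, and the ℓ∞ default ϵ = 8/255 here are assumptions), one weighting-plus-update step might look like:

```python
import numpy as np

# Illustrative sketch of a CosPGD-style step, NOT the authors' code:
# a pixel-wise cosine-similarity weight scales each pixel's loss
# contribution, followed by a standard ell_inf-bounded PGD update.

def cosine_weights(logits, onehot_target):
    """Pixel-wise cosine similarity between the softmax prediction and the
    one-hot target, computed along the class axis (last dimension)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    num = (probs * onehot_target).sum(axis=-1)
    denom = np.linalg.norm(probs, axis=-1) * np.linalg.norm(onehot_target, axis=-1)
    # In (0, 1]; for a non-targeted attack, still-correct pixels (high
    # similarity) receive larger weight, focusing the attack on them.
    return num / denom

def pgd_linf_step(x_adv, x_clean, grad, alpha=0.01, epsilon=8 / 255):
    """One ell_inf PGD step: ascend along the sign of the (weighted-loss)
    gradient, project back into the epsilon-ball around the clean input,
    then clip to the valid image range."""
    x_adv = x_adv + alpha * np.sign(grad)
    x_adv = np.clip(x_adv, x_clean - epsilon, x_clean + epsilon)
    return np.clip(x_adv, 0.0, 1.0)
```

In a full attack loop, `grad` would be the gradient of the cosine-weighted pixel-wise loss with respect to the input, obtained via autodiff; the sketch above only shows the weighting and the projection step that the quoted α and ϵ values parameterize.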