Mutual-Modality Adversarial Attack with Semantic Perturbation
Authors: Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on several benchmark datasets and demonstrate that our mutual-modal attack strategy can effectively produce highly transferable attacks, which are stable regardless of the target networks. Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution. |
| Researcher Affiliation | Academia | Jingwen Ye, Ruonan Yu, Songhua Liu, Xinchao Wang National University of Singapore jingweny@nus.edu.sg, {ruonan,songhua.liu}@u.nus.sg, xinchao@nus.edu.sg |
| Pseudocode | No | With the iterative training of G and P, we obtain the final generative perturbation network G; the whole algorithm is given in the supplementary. (A hedged sketch of this alternating loop follows the table.) |
| Open Source Code | No | The paper does not include an unambiguous statement of code release or a direct link to a source code repository for the methodology described. |
| Open Datasets | Yes | We evaluate attacks using two popular datasets in adversarial examples research, which are the CIFAR-10 dataset (Krizhevsky 2009) and the ImageNet dataset (Russakovsky et al. 2014). |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing. While Table 1 mentions 'Train/Val', the specific proportions are not defined. |
| Hardware Specification | Yes | A total of 10 iterations (we set the NUMG to be 2) are used to train the whole network, which costs about 8 hours on one NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'We used PyTorch framework for the implementation' but does not specify its version number or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | We used PyTorch framework for the implementation. In the normal setting of using the pre-trained CLIP as the surrogate model, we choose the ViT/32 as backbone. As for the generator, we choose to use the ResNet backbone, and set the learning rate to be 0.0001 with Adam optimizer. All images are scaled to 224×224 to train the generator. For the ℓ∞ bound, we set ϵ = 0.04. A total of 10 iterations (we set the NUMG to be 2) are used to train the whole network... (A minimal configuration sketch appears after the table.) |
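
The Pseudocode row notes that G and P are trained iteratively but defers the full algorithm to the paper's supplementary material. Below is a minimal sketch of such an alternating loop, assuming `G` is the generative perturbation network, `P` is the co-trained network named in the quote, and `g_loss_fn`/`p_loss_fn` are placeholder objectives; the update schedule and these names are assumptions, not the authors' code.

```python
import torch

NUM_ITERS = 10  # total training iterations reported in the paper
NUMG = 2        # generator updates per iteration (the paper's NUMG)

def train(G, P, loader, g_loss_fn, p_loss_fn, g_opt, p_opt):
    """Hypothetical alternating optimization of G and P."""
    for _ in range(NUM_ITERS):
        # Update the generative perturbation network G for NUMG passes,
        # keeping P fixed (assumed schedule).
        for _ in range(NUMG):
            for images, _ in loader:
                g_opt.zero_grad()
                loss_g = g_loss_fn(G, P, images)  # placeholder objective
                loss_g.backward()
                g_opt.step()
        # Then update P with G frozen (assumed schedule).
        for images, _ in loader:
            p_opt.zero_grad()
            loss_p = p_loss_fn(G, P, images)  # placeholder objective
            loss_p.backward()
            p_opt.step()
    return G  # the final generative perturbation network
```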
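
The Experiment Setup row pins down concrete hyperparameters: 224×224 inputs, an ℓ∞ bound of ϵ = 0.04, and Adam at learning rate 0.0001. The sketch below shows one way a generator's output could be kept inside that bound; the `tanh` squashing and bilinear resize are illustrative assumptions, not the authors' implementation, and the frozen CLIP ViT/32 surrogate is omitted.

```python
import torch
import torch.nn.functional as F

EPS = 0.04      # l_inf bound from the paper
IMG_SIZE = 224  # all images are scaled to 224x224

def perturb(G, images):
    """Generate an l_inf-bounded adversarial image (illustrative only)."""
    x = F.interpolate(images, size=(IMG_SIZE, IMG_SIZE),
                      mode="bilinear", align_corners=False)
    delta = EPS * torch.tanh(G(x))           # squash raw output into [-EPS, EPS]
    return torch.clamp(x + delta, 0.0, 1.0)  # stay in the valid pixel range

# Optimizer as reported: g_opt = torch.optim.Adam(G.parameters(), lr=1e-4)
```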