On the Vulnerability of Adversarially Trained Models Against Two-faced Attacks
Authors: Shengjie Zhou, Lue Tao, Yuzhou Cao, Tao Xiang, Bo An, Lei Feng
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To provide a comprehensive study on two-faced attacks, we verify the adversarial robustness of various network architectures (e.g., ResNet-18 and WideResNet-28-10) on multiple benchmark datasets including CIFAR-10, SVHN (Netzer et al., 2011), CIFAR-100 (Krizhevsky, 2009), and Tiny-ImageNet (Yao et al., 2015), employing different adversarial training methods including PGDAT (Madry et al., 2018), TRADES (Zhang et al., 2019), MART (Wang et al., 2020b), FAT (Zhang et al., 2020), and THRM (Tao et al., 2022b). In addition, we validate various off-the-shelf adversarially trained models from RobustBench (Croce et al., 2021), with diverse architectures and training strategies. We also evaluate the threat of two-faced attacks against different robustness verification methods and examine the transferability of two-faced attacks. All experimental results consistently show that these models exhibit higher robustness under two-faced attacks compared with their actual robustness, which demonstrates that adversarially trained models are indeed vulnerable to two-faced attacks, and such attacks may be widespread. |
| Researcher Affiliation | Collaboration | (1) Chongqing University; (2) Nanjing University; (3) Nanyang Technological University; (4) Skywork AI |
| Pseudocode | Yes | Algorithm 1 Two-Faced Examples Generation |
| Open Source Code | No | The paper does not explicitly state that source code for their methodology is provided or publicly available. |
| Open Datasets | Yes | We verify the adversarial robustness of various network architectures (e.g., ResNet-18 and WideResNet-28-10) on multiple benchmark datasets including CIFAR-10, SVHN (Netzer et al., 2011), CIFAR-100 (Krizhevsky, 2009), and Tiny-ImageNet (Yao et al., 2015)... |
| Dataset Splits | Yes | CIFAR-10 (Krizhevsky, 2009) is a widely used image dataset in computer vision research. It consists of 60,000 32×32 pixel color images (50,000 images for training and 10,000 images for testing)... We use the original test data set as the Clean validation data. |
| Hardware Specification | Yes | We conduct all experiments using NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions using 'PyTorch' but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We train ResNet-18 (He et al., 2015), DenseNet-121 (Huang et al., 2016), and WideResNet-28-10 (Zagoruyko & Komodakis, 2016) models using SGD with a learning rate of 0.1, momentum of 0.9, and weight decay of 5 × 10⁻⁴. Additionally, the MLP and VGG-16 (Simonyan & Zisserman, 2014) models are trained with SGD using a learning rate of 0.01, momentum of 0.9, and weight decay of 5 × 10⁻⁴. For all architectures, the number of training epochs is fixed at 110 with batch size 128, and the learning rate is decayed by a factor of 0.1 at the 100th and the 105th epoch, respectively. |
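
The training schedule quoted in the Experiment Setup row can be written down as a small configuration object. The following is a minimal sketch in plain Python, not the authors' code (no source code is reported as released). The names `TrainConfig`, `lr_at_epoch`, `RESNET_LIKE`, and `MLP_VGG` are hypothetical. The sketch also assumes that epochs are counted from 0 and that the decay takes effect from the milestone epoch onward, which the quoted text does not specify.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row."""
    base_lr: float                          # 0.1 or 0.01 depending on architecture
    momentum: float = 0.9
    weight_decay: float = 5e-4
    epochs: int = 110
    batch_size: int = 128
    milestones: Tuple[int, ...] = (100, 105)  # epochs at which the LR is decayed
    gamma: float = 0.1                      # multiplicative decay factor


def lr_at_epoch(cfg: TrainConfig, epoch: int) -> float:
    """Return the learning rate used during `epoch`.

    Assumption: epochs are 0-indexed and a milestone applies from that
    epoch onward. The quoted setup does not state the indexing convention.
    """
    if not 0 <= epoch < cfg.epochs:
        raise ValueError(f"epoch must be in [0, {cfg.epochs}), got {epoch}")
    # Count how many milestones have been reached so far.
    num_decays = sum(1 for m in cfg.milestones if epoch >= m)
    return cfg.base_lr * cfg.gamma ** num_decays


# ResNet-18, DenseNet-121, WideResNet-28-10: base learning rate 0.1.
RESNET_LIKE = TrainConfig(base_lr=0.1)
# MLP and VGG-16: base learning rate 0.01.
MLP_VGG = TrainConfig(base_lr=0.01)

if __name__ == "__main__":
    for epoch in (0, 99, 100, 104, 105, 109):
        print(epoch, lr_at_epoch(RESNET_LIKE, epoch), lr_at_epoch(MLP_VGG, epoch))
```

Under the same indexing assumption, this step schedule corresponds to what PyTorch's `torch.optim.lr_scheduler.MultiStepLR` computes with `milestones=[100, 105]` and `gamma=0.1` on top of `torch.optim.SGD`. The summary notes that the paper mentions PyTorch without version numbers, so the exact scheduler the authors used is not confirmed.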