Adversarial Robustness through Random Weight Sampling
Authors: Yanxiang Ma, Minjing Dong, Chang Xu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of CTRW on several datasets and benchmark convolutional neural networks. Our results indicate that our model achieves a robust accuracy approximately 16% to 17% higher than the baseline model under PGD-20 and 22% to 25% higher on AutoAttack. |
| Researcher Affiliation | Academia | Yanxiang Ma, Minjing Dong, Chang Xu; School of Computer Science, University of Sydney; {yama9404, mdon0736}@uni.sydney.edu.au, c.xu@sydney.edu.au |
| Pseudocode | Yes | Algorithm 1 Adversarial Training under Constrain Using Black-Box Attack |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing the source code for the work described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We evaluate CTRW on CIFAR [38] and Imagenet [39]. |
| Dataset Splits | No | The paper mentions 'Each CIFAR dataset contains 5.0 × 10^4 training examples and 1.0 × 10^4 test examples' and 'The Imagenet dataset consists of 1.2 × 10^6 training examples and 5.0 × 10^4 test examples', but does not explicitly provide details for a validation split or its proportion, nor does it specify how these standard splits are handled regarding validation. |
| Hardware Specification | Yes | The model is implemented using PyTorch [40] and trained and evaluated on a single NVIDIA GeForce RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch [40]' as the implementation framework but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | In the experiment, the dataset is split into batches of size 128, and the weight decay is set to 5.0 × 10^-4. We use an SGD optimizer with a momentum of 0.9. The learning rate is initially set to 0.1 and reduced by a multi-step scheduler. The network is trained for 200 epochs, with the learning rate reduced by a factor of 10 at epochs 100 and 150. For adversarial training, we set ϵ_d to 8/255 and the step length η to 2/255 for a 10-step PGD [9]. |
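
The Experiment Setup row above lists concrete hyperparameters (batch size 128, SGD with momentum 0.9 and weight decay 5e-4, MultiStep learning-rate schedule, 10-step PGD with ϵ = 8/255 and step 2/255). The following is a minimal sketch of that configuration, assuming a standard PGD adversarial-training loop in PyTorch on CIFAR-10; the ResNet-18 model is a placeholder and the paper's CTRW random-weight components are not reproduced here.

```python
# Sketch of the reported training setup (assumption: standard PGD-AT loop,
# plain ResNet-18 stand-in for the paper's CTRW-equipped network).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

device = "cuda" if torch.cuda.is_available() else "cpu"

# Batch size 128, as reported in the Experiment Setup row.
train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

model = models.resnet18(num_classes=10).to(device)

# SGD with momentum 0.9 and weight decay 5e-4; initial lr 0.1,
# decayed by 10x at epochs 100 and 150 over 200 epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150], gamma=0.1)

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """10-step PGD with eps = 8/255 and step length 2/255, as reported."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

for epoch in range(200):
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)           # craft adversarial examples
        loss = F.cross_entropy(model(x_adv), y)   # train on adversarial inputs
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The same `pgd_attack` helper can be reused at evaluation time with `steps=20` to approximate the PGD-20 robust-accuracy protocol quoted in the Research Type row; the AutoAttack numbers would require the separate `autoattack` package, which is not shown here.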