PubDef: Defending Against Transfer Attacks From Public Models
Authors: Chawin Sitawarin, Jaewon Chang, David Huang, Wesson Altoyan, David Wagner
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate transfer attacks in this setting and propose a specialized defense method based on a game-theoretic perspective. The defenses are evaluated against 24 public models and 11 attack algorithms across three datasets (CIFAR-10, CIFAR-100, and ImageNet). Under this threat model, our defense, PUBDEF, outperforms state-of-the-art white-box adversarial training by a large margin with almost no loss in normal accuracy. For instance, on ImageNet, our defense achieves 62% accuracy under the strongest transfer attack vs. only 36% for the best adversarially trained model. Its accuracy when not under attack is only 2% lower than that of an undefended model (78% vs. 80%). (See the transfer-evaluation sketch below the table.) |
| Researcher Affiliation | Collaboration | Chawin Sitawarin (UC Berkeley), Jaewon Chang (UC Berkeley), David Huang (UC Berkeley), Wesson Altoyan (King Abdulaziz City for Science and Technology), David Wagner (UC Berkeley) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available here. |
| Open Datasets | Yes | The defenses are evaluated against 24 public models and 11 attack algorithms across three datasets (CIFAR-10, CIFAR-100, and ImageNet). |
| Dataset Splits | Yes | The clean accuracy is simply the accuracy on the test set, with no attack. |
| Hardware Specification | Yes | All of the models are trained on Nvidia A100 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | All CIFAR-10/100 models are trained for 200 epochs with a learning rate of 0.1, weight decay of 5e-4, and a batch size of 2048. ImageNet models are trained for 50 epochs with a learning rate of 0.1, weight decay of 1e-4, and a batch size of 512. |
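The Experiment Setup row quotes the training hyperparameters directly. Below is a minimal configuration sketch that mirrors those numbers, assuming PyTorch and SGD; the optimizer family, momentum value, and any learning-rate schedule are assumptions not stated in the excerpt, and the helper name `make_optimizer` is hypothetical.

```python
import torch

# Only the numbers quoted in the Experiment Setup row (epochs, lr, weight
# decay, batch size) come from the paper; everything else is assumed.
RECIPES = {
    "cifar":    dict(epochs=200, lr=0.1, weight_decay=5e-4, batch_size=2048),
    "imagenet": dict(epochs=50,  lr=0.1, weight_decay=1e-4, batch_size=512),
}

def make_optimizer(model, dataset="cifar"):
    cfg = RECIPES[dataset]
    opt = torch.optim.SGD(
        model.parameters(),
        lr=cfg["lr"],
        momentum=0.9,                      # assumed; not given in the excerpt
        weight_decay=cfg["weight_decay"],
    )
    return opt, cfg
```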
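The Research Type row summarizes the threat model: adversarial examples are crafted against publicly available surrogate models and transferred to the defended model. The sketch below shows one way such a transfer evaluation could be scored, assuming a standard L-infinity PGD attack and inputs in [0, 1]; the paper itself evaluates 11 attack algorithms, and the names `pgd_attack`, `transfer_accuracy`, `surrogate`, `defended`, and the eps = 8/255 budget are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(surrogate, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft L-inf PGD examples against the public surrogate model only."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(surrogate(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def transfer_accuracy(defended, surrogate, loader, device="cuda"):
    """Accuracy of the defended model on examples crafted against the surrogate."""
    defended.eval()
    surrogate.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(surrogate, x, y)   # the attacker never queries `defended`
        with torch.no_grad():
            pred = defended(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
```

The key property of the setting, reflected in the comment above, is that the attack has gradient access only to the public surrogate; the defended model is evaluated purely as a black box on the transferred examples.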