Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation
Authors: Yuzheng Wang, Zhaoyu Chen, Dingkang Yang, Pinxue Guo, Kaixun Jiang, Wenqiang Zhang, Lizhe Qi
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed DFARD method on 32×32 CIFAR datasets (Krizhevsky, Hinton et al. 2009)... The robustness performances of our and other baseline methods are shown in Table 1. |
| Researcher Affiliation | Academia | ¹Shanghai Engineering Research Center of AI & Robotics, Academy for Engineering & Technology, Fudan University ²Engineering Research Center of AI & Robotics, Ministry of Education, Academy for Engineering & Technology, Fudan University {yzwang20, zhaoyuchen20}@fudan.edu.cn |
| Pseudocode | Yes | Algorithm 1: Training process of our Data-Free Adversarial Robustness Distillation |
| Open Source Code | No | No explicit statement about providing open-source code for the described methodology or a direct link to a code repository was found. |
| Open Datasets | Yes | We evaluate the proposed DFARD method on 32×32 CIFAR datasets (Krizhevsky, Hinton et al. 2009) |
| Dataset Splits | No | The paper uses CIFAR datasets which have standard splits, but it does not explicitly state the train/validation/test split percentages, sample counts, or refer to predefined splits with specific details for reproducibility in its own experiments. |
| Hardware Specification | Yes | All models are trained on RTX 3090 GPUs (Paszke et al. 2019). |
| Software Dependencies | No | The paper states, 'Our proposed method and all others are implemented in PyTorch,' but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The students are trained via SGD optimizer with cosine annealing learning rate with an initial value of 0.05, momentum of 0.9, and weight decay of 1e-4. The generators are trained via Adam optimizer with a learning rate of 1e-3, β1 of 0.5, β2 of 0.999. The distillation batch size and the synthesis batch size are both 256. The distillation epochs T is 200, the iterations of generator T_g is 1, and the iterations of student T_s is 5. Both the student model and the generator are randomly initialized. A 10-step PGD (PGD-10) with a random start size of 0.001 and step size of 2/255 is used to generate adversarial samples. The perturbation bounds are set to L∞ norm ε = 8/255. The perturbation steps for PGD_S, PGD_T and CW are 20. |
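The Experiment Setup row above lists concrete optimizer and attack hyperparameters. Below is a minimal PyTorch sketch of that reported configuration, assuming standard `torch.optim` components and a generic L∞ PGD loop; `build_optimizers` and `pgd_attack` are illustrative names introduced here, not code from the paper, and the DFARD-specific generator and distillation losses are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def build_optimizers(student: nn.Module, generator: nn.Module, epochs: int = 200):
    """Optimizers as reported: SGD + cosine annealing for the student, Adam for the generator."""
    # Student: SGD with initial LR 0.05, momentum 0.9, weight decay 1e-4, cosine-annealed over T=200 epochs.
    opt_s = torch.optim.SGD(student.parameters(), lr=0.05, momentum=0.9, weight_decay=1e-4)
    sched_s = torch.optim.lr_scheduler.CosineAnnealingLR(opt_s, T_max=epochs)
    # Generator: Adam with lr 1e-3 and betas (0.5, 0.999).
    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3, betas=(0.5, 0.999))
    return opt_s, sched_s, opt_g


def pgd_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
               eps: float = 8 / 255, alpha: float = 2 / 255,
               steps: int = 10, random_start: float = 0.001) -> torch.Tensor:
    """Generic L-inf PGD matching the reported PGD-10 settings
    (random start 0.001, step size 2/255, perturbation bound 8/255)."""
    x_adv = x.clone().detach()
    # Random start within a small uniform ball, as reported.
    x_adv = x_adv + torch.empty_like(x_adv).uniform_(-random_start, random_start)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            # Project back into the eps-ball around the clean input and the valid pixel range.
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
            x_adv = torch.clamp(x_adv, 0.0, 1.0)
        x_adv = x_adv.detach()
    return x_adv
```

For the evaluation attacks quoted above (PGD_S, PGD_T, CW), the same loop would be run with `steps=20` and the corresponding attack loss substituted for the plain cross-entropy used in this sketch.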