Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation

Authors: Yuzheng Wang, Zhaoyu Chen, Dingkang Yang, Pinxue Guo, Kaixun Jiang, Wenqiang Zhang, Lizhe Qi

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed DFARD method on 32×32 CIFAR datasets (Krizhevsky, Hinton et al. 2009)... The robustness performances of our and other baseline methods are shown in Table 1.
Researcher Affiliation | Academia | (1) Shanghai Engineering Research Center of AI & Robotics, Academy for Engineering & Technology, Fudan University; (2) Engineering Research Center of AI & Robotics, Ministry of Education, Academy for Engineering & Technology, Fudan University; {yzwang20, zhaoyuchen20}@fudan.edu.cn
Pseudocode | Yes | Algorithm 1: Training process of our Data-Free Adversarial Robustness Distillation
Open Source Code | No | No explicit statement about providing open-source code for the described methodology or a direct link to a code repository was found.
Open Datasets | Yes | We evaluate the proposed DFARD method on 32×32 CIFAR datasets (Krizhevsky, Hinton et al. 2009)
Dataset Splits | No | The paper uses CIFAR datasets, which have standard splits, but it does not explicitly state train/validation/test split percentages or sample counts, nor does it reference predefined splits in enough detail to reproduce its experiments.
Hardware Specification | Yes | All models are trained on RTX 3090 GPUs (Paszke et al. 2019).
Software Dependencies | No | The paper states, 'Our proposed method and all others are implemented in PyTorch,' but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | The students are trained via SGD optimizer with cosine annealing learning rate with an initial value of 0.05, momentum of 0.9, and weight decay of 1e-4. The generators are trained via Adam optimizer with a learning rate of 1e-3, β1 of 0.5, β2 of 0.999. The distillation batch size and the synthesis batch size are both 256. The distillation epochs T is 200, the iterations of generator Tg is 1, and the iterations of student Ts is 5. Both the student model and the generator are randomly initialized. A 10-step PGD (PGD-10) with a random start size of 0.001 and step size of 2/255 is used to generate adversarial samples. The perturbation bounds are set to L∞ norm ϵ = 8/255. The perturbation steps for PGD_S, PGD_T and CW are 20.
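
The quoted setup maps onto standard PyTorch training components. The sketch below is a minimal, hedged reconstruction of that configuration: the student and generator networks shown are hypothetical placeholders (the paper's architectures and DFARD losses are not reproduced here), and only the hyperparameters named in the quote above are taken from the paper.

```python
# Minimal sketch of the quoted experiment setup in PyTorch. The student/generator
# modules below are hypothetical placeholders; only the named hyperparameters
# come from the paper's stated setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

T_EPOCHS = 200        # distillation epochs T
T_GENERATOR = 1       # generator iterations Tg per epoch
T_STUDENT = 5         # student iterations Ts per epoch
BATCH_SIZE = 256      # both distillation and synthesis batch size
EPSILON = 8 / 255     # L_inf perturbation bound
PGD_STEPS = 10        # PGD-10 at training time
PGD_STEP_SIZE = 2 / 255
PGD_RANDOM_START = 1e-3

# Placeholder networks (randomly initialized, as stated); the real DFARD
# architectures are not reproduced here.
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
generator = nn.Sequential(nn.Linear(100, 3 * 32 * 32), nn.Tanh(),
                          nn.Unflatten(1, (3, 32, 32)))

# Student: SGD with cosine-annealed LR starting at 0.05, momentum 0.9, weight decay 1e-4.
opt_s = torch.optim.SGD(student.parameters(), lr=0.05,
                        momentum=0.9, weight_decay=1e-4)
sched_s = torch.optim.lr_scheduler.CosineAnnealingLR(opt_s, T_max=T_EPOCHS)

# Generator: Adam with LR 1e-3 and betas (0.5, 0.999).
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3, betas=(0.5, 0.999))


def pgd10(model, x, y):
    """PGD-10 under the L_inf bound with the quoted random-start and step sizes."""
    # Small uniform random start of magnitude 0.001, then 10 signed-gradient steps.
    x_adv = (x + torch.empty_like(x).uniform_(-PGD_RANDOM_START, PGD_RANDOM_START)).clamp(0, 1)
    for _ in range(PGD_STEPS):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + PGD_STEP_SIZE * grad.sign()
        # Project back into the epsilon-ball around x and the valid image range.
        x_adv = torch.min(torch.max(x_adv, x - EPSILON), x + EPSILON).clamp(0, 1)
    return x_adv.detach()
```

The pseudo-labels for synthetic images and the distillation/synthesis objectives belong to DFARD's Algorithm 1 and are not reconstructed here; the sketch only fixes the optimizer, scheduler, batch-size, and attack settings the quote specifies. Note that the 0.001 random start is applied as a small uniform perturbation before the PGD iterations, following the quoted "random start size" rather than the more common uniform start over the full ϵ-ball.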