Adversarially robust transfer learning
Authors: Ali Shafahi, Parsa Saadatpanah, Chen Zhu, Amin Ghiasi, Christoph Studer, David Jacobs, Tom Goldstein
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the power of robust transfer learning, we transfer a robust ImageNet source model onto the CIFAR domain, achieving both high accuracy and robustness in the new domain without adversarial training. We use visualization methods to explore properties of robust feature extractors. |
| Researcher Affiliation | Academia | Ali Shafahi, Parsa Saadatpanah, Chen Zhu, Amin Ghiasi, Christoph Studer, {ashafahi,parsa,chenzhu,amin}@cs.umd.edu; studer@cornell.edu; David Jacobs, Tom Goldstein, {djacobs,tomg}@cs.umd.edu |
| Pseudocode | No | The paper describes algorithms and procedures in prose and through mathematical formulations but does not include any distinct pseudocode blocks or formally labeled algorithm sections. |
| Open Source Code | Yes | Source code for LwF-based experiments: https://github.com/ashafahi/RobustTransferLWF |
| Open Datasets | Yes | We use models trained on CIFAR-100 as source models and perform transfer learning from CIFAR-100 to CIFAR-10. ...ImageNet (Russakovsky et al., 2015)... CIFAR-10 (Krizhevsky & Hinton, 2009)... |
| Dataset Splits | Yes | We train for 20,000 iterations using Momentum SGD and a learning rate of 0.001. We then incrementally unfreeze and train more blocks. For each experiment, we evaluate the newly trained model's accuracy on validation adversarial examples built with a 20-step PGD ℓ∞ attack with ϵ = 8. ...To evaluate our method on two datasets with more similar attributes, we randomly partition CIFAR-100 into two disjoint subsets where each subset contains images corresponding to 50 classes. (A hedged sketch of this evaluation attack appears below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU specifications, or memory configurations used for running the experiments. It only generally refers to 'computing power'. |
| Software Dependencies | No | The paper mentions training with 'Momentum SGD' and implicitly uses deep learning frameworks, but it does not specify any software dependencies (e.g., PyTorch, TensorFlow) with their version numbers required to replicate the experiments. |
| Experiment Setup | Yes | We train for 20,000 iterations using Momentum SGD and a learning rate of 0.001. ...We adversarially train the WRN 32-10 on CIFAR-100 using a 7-step ℓ∞ PGD attack with step-size=2 and ϵ = 8. We train for 80,000 iterations with a batch-size of 128. ...In our LwF-based experiments, we use a batch-size of 128, a fixed learning-rate of 1e-2, and fine-tune for an additional 20,000 iterations. The first 10,000 iterations are used for warm-start, during which we only update the final fully connected layer's weights. During the remaining 10,000 iterations, we update all of the weights but do not update the batch-normalization parameters. |
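
The evaluation attack quoted in the "Dataset Splits" row (a 20-step PGD ℓ∞ attack with ϵ = 8 on the 0–255 pixel scale) is standard enough to sketch. The PyTorch snippet below is a minimal illustration, not the authors' code; the evaluation step size (2/255), the [0, 1] input range, and the function and variable names are assumptions not stated in the paper.

```python
import torch
import torch.nn.functional as F


def pgd_linf(model, images, labels, eps=8 / 255, step_size=2 / 255, steps=20):
    """Sketch of a 20-step l_inf PGD evaluation attack (eps = 8 on the 0-255 scale).

    The step size is an assumption: the paper only gives a step size (2) for the
    7-step training attack. Inputs are assumed to be scaled to [0, 1] and the
    model is assumed to already be in eval() mode.
    """
    # Random start inside the eps-ball, a common PGD convention.
    delta = torch.empty_like(images).uniform_(-eps, eps)
    delta = (images + delta).clamp(0, 1) - images

    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(images + delta), labels)
        grad, = torch.autograd.grad(loss, delta)
        # Gradient-sign ascent step, then projection back onto the eps-ball
        # and onto the valid pixel range.
        delta = (delta.detach() + step_size * grad.sign()).clamp(-eps, eps)
        delta = (images + delta).clamp(0, 1) - images

    return (images + delta).detach()
```

Robust validation accuracy would then be the fraction of examples the model still classifies correctly after replacing each batch `x` with `pgd_linf(model, x, y)`.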
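The LwF-based schedule in the "Experiment Setup" row (fixed learning rate of 1e-2, 10,000 warm-start iterations updating only the final fully connected layer, then 10,000 iterations updating everything except the batch-normalization parameters) can be outlined as a two-phase fine-tuning loop. The sketch below is assumption-laden: the classifier attribute name (`model.fc`), the momentum value, how freezing of batch-norm is realized, and the omission of the LwF feature-distillation term are choices made here, not details from the paper.

```python
import torch
from torch import nn, optim


def lwf_style_finetune(model, train_loader, total_iters=20_000, warm_start_iters=10_000):
    """Two-phase schedule described for the LwF-based experiments (sketch only).

    Phase 1 (first 10k iterations): update only the final classifier's weights.
    Phase 2 (remaining 10k iterations): update all weights except batch-norm.
    The LwF distillation penalty on the source features is omitted for brevity.
    """
    criterion = nn.CrossEntropyLoss()
    # Phase 1: only the final fully connected layer is optimized.
    optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)

    data_iter = iter(train_loader)
    for it in range(total_iters):
        if it == warm_start_iters:
            # Phase 2: optimize every parameter that does not belong to a
            # batch-norm layer, and freeze batch-norm running statistics.
            non_bn_params = [p for m in model.modules()
                             if not isinstance(m, nn.BatchNorm2d)
                             for p in m.parameters(recurse=False)]
            optimizer = optim.SGD(non_bn_params, lr=1e-2, momentum=0.9)
            model.apply(lambda m: m.eval() if isinstance(m, nn.BatchNorm2d) else None)

        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(train_loader)
            x, y = next(data_iter)

        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```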