On Transfer of Adversarial Robustness from Pretraining to Downstream Tasks

Authors: Laura F. Nern, Harsh Raj, Maurice André Georgi, Yash Sharma

NeurIPS 2023

Variable: Research Type
Result: Experimental
LLM Response: "We validate our theoretical results in practical applications, show how our results can be used for calibrating expectations of downstream robustness, and when our results are useful for optimal transfer learning. Taken together, our results offer an initial step towards characterizing the requirements of the representation function for reliable post-adaptation performance." "Our theoretical results presented in Sec. 3 and Sec. 4 are empirically studied in Sec. 5, where we validate the results in practice, and in Sec. 6 to show how the theory can be used for calibrating expectations of the downstream robustness in a self-supervised manner [4]."

Variable: Researcher Affiliation
Result: Collaboration
LLM Response: Laura F. Nern, Yahoo Research (laurafee.nern@yahooinc.com); Harsh Raj, Delhi Technological University (harsh777111raj@gmail.com); Maurice Georgi, Hyundai Mobis (georgimaurice@gmail.com); Yash Sharma, University of Tübingen (yash.sharma@bethgelab.org)

Variable: Pseudocode
Result: No
LLM Response: The paper describes the theoretical framework and experimental procedures but does not include any explicitly labeled pseudocode or algorithm blocks. Methods are described through mathematical formulations and narrative text.

Variable: Open Source Code
Result: Yes
LLM Response: https://github.com/lf-tcho/robustness_transfer

Variable: Open Datasets
Result: Yes
LLM Response: "We consider two models robustly pretrained on CIFAR-100 [1, 34] and ImageNet [47, 17]. We investigate the performance of robustness transfer on CIFAR-10 [34], Fashion-MNIST [57] and Intel Image Classification." "We consider the dSprites [41] dataset, which consists of 64x64 grayscale images of 2D shapes procedurally generated from 6 factors of variation (FoVs) which fully specify each sprite image: color, shape, scale, orientation, x-position (posX) and y-position (posY)."

Variable: Dataset Splits
Result: Yes
LLM Response: "We consider the dSprites [41] dataset... The dataset consists of 737,280 images, of which we use an 80-20 train-test split."

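For concreteness, here is a minimal sketch of the quoted 80-20 split, assuming the official dSprites archive from the deepmind/dsprites-dataset repository. The random seed and variable names are illustrative; the paper does not state how the split was drawn.

```python
# Minimal sketch: 80-20 train-test split of dSprites (737,280 images).
# Assumes the official archive from github.com/deepmind/dsprites-dataset;
# the seed is an assumption -- the paper does not specify one.
import numpy as np

data = np.load("dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz")
images = data["imgs"]             # shape (737280, 64, 64), binary images
latents = data["latents_values"]  # shape (737280, 6), the six FoVs

rng = np.random.default_rng(0)
perm = rng.permutation(len(images))
cut = int(0.8 * len(images))      # 589,824 train / 147,456 test
train_idx, test_idx = perm[:cut], perm[cut:]
x_train, x_test = images[train_idx], images[test_idx]
```
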
Variable: Hardware Specification
Result: Yes
LLM Response: "The experiments were run on a single GeForce RTX 3080 GPU and for each downstream task transfer learning or theory evaluation took between 30 minutes and 3 hours."

Variable: Software Dependencies
Result: No
LLM Response: The paper mentions software such as the foolbox package [45] for PGD attacks, but does not specify version numbers for any libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages, which are necessary for a reproducible description of the environment.

Variable: Experiment Setup
Result: Yes
LLM Response: "For the model pretrained on CIFAR-100, we run 20 epochs of linear probing (LP) using a batch size of 128, and resize all input images to 32x32. For the model pretrained on ImageNet, we run 10 epochs of LP using a batch size of 32, and resize all input images to 256x256. For LP, we set the learning rate using a cosine annealing schedule [39], with an initial learning rate of 0.01, and use stochastic gradient descent with a momentum of 0.9 on the cross-entropy loss. As hyperparameters for the L∞-PGD attack, we choose 20 steps and a relative step size of 0.7, since this setting yields the highest adversarial sensitivity (AS) scores over all the tasks. The attack strength ϵ is set to ϵ = 8/255 for attacking the CIFAR-100 pretrained model and ϵ = 1/255 for the ImageNet model."
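
The quoted setup maps onto a short PyTorch/foolbox sketch. Only the quoted hyperparameters come from the paper; everything else is an assumption, in particular the ResNet-18 stand-in encoder (not the authors' robustly pretrained checkpoint) and the CIFAR-10 downstream task with its data-loading details.

```python
# Sketch of the linear-probing (LP) and L-infinity PGD setup quoted above, using
# the CIFAR-100-pretrained model's hyperparameters (20 epochs, batch size 128,
# 32x32 inputs, SGD momentum 0.9, cosine-annealed lr from 0.01, LinfPGD with
# 20 steps, relative step size 0.7, eps = 8/255). The encoder is a placeholder,
# not the authors' robustly pretrained checkpoint.
import torch
import torch.nn as nn
import foolbox as fb
from torchvision import datasets, transforms
from torchvision.models import resnet18

tfm = transforms.Compose([transforms.Resize(32), transforms.ToTensor()])
train_set = datasets.CIFAR10("data", train=True, download=True, transform=tfm)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

encoder = resnet18(weights=None)   # stand-in for the robustly pretrained encoder
encoder.fc = nn.Identity()         # expose the 512-d features
for p in encoder.parameters():
    p.requires_grad = False        # LP freezes the representation
encoder.eval()

probe = nn.Linear(512, 10)         # linear probe for the 10 CIFAR-10 classes
optimizer = torch.optim.SGD(probe.parameters(), lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)
criterion = nn.CrossEntropyLoss()

for epoch in range(20):            # 20 epochs of LP
    for x, y in train_loader:
        loss = criterion(probe(encoder(x)), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()

# Adversarial evaluation with foolbox's LinfPGD, using the quoted attack settings.
model = nn.Sequential(encoder, probe).eval()
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
attack = fb.attacks.LinfPGD(steps=20, rel_stepsize=0.7)
x, y = next(iter(train_loader))    # evaluation would use the test split in practice
_, x_adv, is_adv = attack(fmodel, x, y, epsilons=8 / 255)
```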