Adversarial Training Helps Transfer Learning via Better Representations

Authors: Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Y. Zou

NeurIPS 2021

Reproducibility
Variable | Result | LLM Response
Research Type | Experimental | We support our theories with experiments on popular datasets and deep learning architectures. We perform an empirical study of image classification: our source tasks are image classification on ImageNet [45]; our target tasks are image classification on CIFAR-10 [28].
Researcher Affiliation | Academia | Zhun Deng (Harvard University, Cambridge, MA 02138, zhundeng@g.harvard.edu); Linjun Zhang (Rutgers University, Piscataway, NJ 08854, linjun.zhang@rutgers.edu); Kailas Vodrahalli (Stanford University, Stanford, CA 94025, kailasv@stanford.edu); Kenji Kawaguchi (Harvard University, Cambridge, MA 02138, kkawaguchi@fas.harvard.edu); James Zou (Stanford University, Stanford, CA 94025, jamesz@stanford.edu)
Pseudocode | Yes | Algorithm 1: Learning for Linear Representations
    Input: {S_t}_{t=1}^{T+1}
    Step 1: Optimize the loss on each individual source task t ∈ [T] and obtain β̂_t = argmin_{β_t} (1/n_t) Σ_i (⟨β_t, x_i^(t)⟩ − y_i^(t))².
    Step 2: Ŵ_1 ← top-r SVD of [β̂_1, β̂_2, …, β̂_T].
    Step 3: ŵ_2^(T+1) ∈ argmin_{w_2^(T+1)} (1/n_{T+1}) Σ_{i=1}^{n_{T+1}} (y_i^(T+1) − ⟨w_2^(T+1), Ŵ_1 x_i^(T+1)⟩)².
    Return: Ŵ_1, ŵ_2^(T+1)
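Algorithm 1's three steps (per-task least squares on each source task, top-r SVD of the stacked coefficients to recover the shared representation, then target-task regression on the projected features) can be sketched in NumPy. The synthetic data, dimensions, and all variable names (`W1_true`, `betas`, `w2_hat`, etc.) are illustrative assumptions for this sketch, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d input features, r-dimensional shared subspace,
# T source tasks plus one target task, n samples per task.
d, r, T, n = 20, 3, 10, 200

# Ground-truth shared representation W1 (r x d, orthonormal rows) and
# per-task heads w_t; each task's labels are linear in W1 x plus noise.
W1_true = np.linalg.qr(rng.standard_normal((d, r)))[0].T
tasks = []
for t in range(T + 1):
    w_t = rng.standard_normal(r)
    X = rng.standard_normal((n, d))
    y = X @ W1_true.T @ w_t + 0.01 * rng.standard_normal(n)
    tasks.append((X, y))

# Step 1: least squares on each of the T source tasks individually.
betas = np.stack(
    [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in tasks[:T]], axis=1
)  # shape (d, T)

# Step 2: top-r SVD of [beta_1, ..., beta_T] estimates the shared subspace.
U, _, _ = np.linalg.svd(betas, full_matrices=False)
W1_hat = U[:, :r].T  # shape (r, d)

# Step 3: fit the target-task head on the learned representation.
X_tgt, y_tgt = tasks[T]
Z = X_tgt @ W1_hat.T
w2_hat = np.linalg.lstsq(Z, y_tgt, rcond=None)[0]

pred = Z @ w2_hat  # residual error should be near the label-noise level
```

Because the source heads span the r-dimensional subspace, the top-r left singular vectors of the stacked coefficients recover it, and the target task only needs to fit r parameters instead of d.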
Open Source Code | No | The paper states: 'We use a public library for adversarial training [22].' with the URL 'https://github.com/MadryLab/robustness', but does not provide the authors' own implementation of the methodology described in the paper.
Open Datasets | Yes | Our source tasks are image classification on ImageNet [45]; our target tasks are image classification on CIFAR-10 [28].
Dataset Splits | No | The paper uses the ImageNet and CIFAR-10 datasets and discusses training and testing, but it does not explicitly specify training/validation/test splits (e.g., percentages or sample counts) beyond using CIFAR-10 as the target task.
Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper mentions using 'a public library for adversarial training [22]', identified as the Robustness Python library, but gives no version numbers for this library or for other key software dependencies (e.g., Python, PyTorch/TensorFlow).
Experiment Setup | Yes | To simulate the pseudo-labeling setup, we sample 10% of ImageNet, train a ResNet-18 model on this sample (without adversarial training), and generate pseudo-labels for the remaining 90%. We then train a new source model on all of the labeled and pseudo-labeled source data, with and without adversarial training. We use a public library for adversarial training [22]. The high-level approach for adversarial training is as follows: at each iteration, take a small number of gradient steps to generate adversarial examples from an input batch; then update the network weights using the loss gradients from the adversarial batch.
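The high-level loop described above (an inner loop of signed-gradient steps that crafts an adversarial batch, then an outer weight update on that batch) can be sketched on a toy logistic-regression problem in NumPy. The toy data and every hyperparameter (`eps`, `alpha`, step counts, learning rate) are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly-separable binary classification data
# (a stand-in for the image batches used in the paper).
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (X @ w_true > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_grad(w, X, y):
    # Gradient of the logistic loss w.r.t. the inputs: (sigma(Xw) - y) * w.
    return (sigmoid(X @ w) - y)[:, None] * w[None, :]

def weight_grad(w, X, y):
    # Average gradient of the logistic loss w.r.t. the weights.
    return X.T @ (sigmoid(X @ w) - y) / len(y)

w = np.zeros(d)
eps, alpha, pgd_steps, lr = 0.1, 0.05, 3, 0.5
for _ in range(300):
    # Inner loop: a few signed-gradient ascent steps on the inputs,
    # projected back into the L-infinity ball of radius eps around X.
    X_adv = X.copy()
    for _ in range(pgd_steps):
        X_adv = X_adv + alpha * np.sign(input_grad(w, X_adv, y))
        X_adv = np.clip(X_adv, X - eps, X + eps)
    # Outer step: update weights using loss gradients from the adversarial batch.
    w -= lr * weight_grad(w, X_adv, y)

acc = np.mean((sigmoid(X @ w) > 0.5) == (y > 0.5))
```

Setting `eps = 0` collapses the inner loop to a no-op (the clip projects every perturbation back onto `X`), recovering plain gradient descent, which makes the adversarial component easy to ablate.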