Adversarial Training Helps Transfer Learning via Better Representations
Authors: Zhun Deng, Linjun Zhang, Kailas Vodrahalli, Kenji Kawaguchi, James Y. Zou
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We support our theories with experiments on popular data sets and deep learning architectures. We perform an empirical study of image classification: our source tasks are image classification on ImageNet [45]; our target tasks are image classification on CIFAR-10 [28]. |
| Researcher Affiliation | Academia | Zhun Deng Harvard University Cambridge, MA 02138 zhundeng@g.harvard.edu Linjun Zhang Rutgers University Piscataway, NJ 08854 linjun.zhang@rutgers.edu Kailas Vodrahalli Stanford University Stanford, CA 94025 kailasv@stanford.edu Kenji Kawaguchi Harvard University Cambridge, MA 02138 kkawaguchi@fas.harvard.edu James Zou Stanford University Stanford, CA 94025 jamesz@stanford.edu |
| Pseudocode | Yes | Algorithm 1 (Learning for Linear Representations). Input: $\{S_t\}_{t=1}^{T+1}$. Step 1: optimize the loss $\ell$ on each individual source task $t \in [T]$ and obtain $\hat{\beta}_t = \arg\min_{\lVert \beta_t \rVert \le 1} \frac{1}{n_t}\sum_i \ell(\langle \beta_t, x_i^{(t)} \rangle, y_i^{(t)})$. Step 2: $\hat{W}_1 \leftarrow$ top-$r$ SVD of $[\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_T]$. Step 3: $\hat{w}_2^{(T+1)} \in \arg\min_{\lVert w_2^{(T+1)} \rVert \le 1} \frac{1}{n_{T+1}} \sum_{i=1}^{n_{T+1}} \ell(y_i^{(T+1)}, \langle w_2^{(T+1)}, \hat{W}_1 x_i^{(T+1)} \rangle)$. Return $\hat{W}_1$, $\hat{w}_2^{(T+1)}$. (A NumPy sketch of these steps appears after this table.) |
| Open Source Code | No | The paper states 'We use a public library for adversarial training [22]' and points to https://github.com/MadryLab/robustness, but it does not provide the authors' own implementation code for the methodology described in the paper. |
| Open Datasets | Yes | Our source tasks are image classification on ImageNet [45]; our target tasks are image classification on CIFAR-10 [28]. |
| Dataset Splits | No | The paper references the use of ImageNet and CIFAR-10 datasets and discusses training and testing, but it does not explicitly specify the training/validation/test dataset splits (e.g., percentages or sample counts) beyond using CIFAR-10 as a target task. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'a public library for adversarial training [22]', which is identified as 'Robustness (python library)'. However, no specific version numbers for this library or other key software dependencies (e.g., Python, PyTorch/TensorFlow) are provided. |
| Experiment Setup | Yes | To simulate the pseudo-labeling setup, we sample 10% of ImageNet, train a ResNet-18 model on this sample (without adversarial training), and generate pseudo-labels for the remaining 90%. We then train a new source model using all of the source labeled and pseudo-labeled data, with and without adversarial training. We use a public library for adversarial training [22]. The high-level approach for adversarial training is as follows: at each iteration, take a small number of gradient steps to generate adversarial examples from an input batch; then update network weights using the loss gradients from the adversarial batch. (A PyTorch sketch of this loop appears after this table.) |
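The Algorithm 1 pseudocode quoted above can be made concrete with a short NumPy sketch. This is a minimal illustration on synthetic data, assuming a squared loss and plain (unconstrained) least squares in Steps 1 and 3; all dimensions, variable names, and the data-generating process are invented for illustration and are not taken from the paper's code or experiments.

```python
# Minimal sketch of Algorithm 1 on synthetic data (assumed squared loss,
# unconstrained least squares); names and dimensions are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
T, d, r, n = 10, 50, 5, 200          # source tasks, input dim, rank, samples per task

# Synthetic tasks sharing a rank-r representation W1 (d x r, orthonormal columns).
W1_true = np.linalg.qr(rng.standard_normal((d, r)))[0]

# Step 1: per-task regression gives beta_hat_t, an estimate of W1 w2_t.
betas_hat = []
for _ in range(T):
    w2_t = rng.standard_normal(r)
    X = rng.standard_normal((n, d))
    y = X @ W1_true @ w2_t + 0.1 * rng.standard_normal(n)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    betas_hat.append(beta_hat)

# Step 2: top-r SVD of the stacked estimates recovers the shared subspace.
B = np.column_stack(betas_hat)                    # d x T
U, _, _ = np.linalg.svd(B, full_matrices=False)
W1_hat = U[:, :r]                                 # d x r

# Step 3: regress the target task on the learned representation W1_hat^T x.
X_tgt = rng.standard_normal((n, d))
y_tgt = X_tgt @ W1_true @ rng.standard_normal(r) + 0.1 * rng.standard_normal(n)
w2_hat, *_ = np.linalg.lstsq(X_tgt @ W1_hat, y_tgt, rcond=None)

print("W1_hat:", W1_hat.shape, "w2_hat:", w2_hat.shape)
```

The point of Step 2 is that stacking the per-task estimates and taking their top-r singular vectors recovers the shared low-dimensional representation, which the target task then reuses in Step 3 with only a small amount of target data.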
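The adversarial-training loop described in the Experiment Setup row follows the standard PGD recipe: a few inner gradient steps craft adversarial examples for each input batch, and the network weights are then updated using the loss on that adversarial batch. The sketch below is a generic PyTorch rendition of that loop, not the paper's implementation (the authors used the MadryLab robustness library); the function names and the eps, alpha, and steps values are placeholders.

```python
# Generic PGD-style adversarial training loop (sketch); eps/alpha/steps are
# placeholder hyperparameters, not the paper's configuration.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    """Craft adversarial examples with a few projected gradient steps."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv.detach()

def adversarial_train_epoch(model, loader, optimizer, device="cpu"):
    """One epoch: generate adversarial batches, then update weights on them."""
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)            # inner maximization
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)    # outer minimization
        loss.backward()
        optimizer.step()
```

In the paper's transfer setting, this adversarial training is applied when training the source (ImageNet) model, whose representations are then transferred to the target task (CIFAR-10).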