Transferable Adversarial Training: A General Approach to Adapting Deep Classifiers

Authors: Hong Liu, Mingsheng Long, Jianmin Wang, Michael I. Jordan

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "A series of experiments validate that our approach advances the state of the art on a variety of domain adaptation tasks in vision and NLP, including object recognition, learning from synthetic to real data, and sentiment classification."
Researcher Affiliation | Academia | Hong Liu (1,2), Mingsheng Long (1,3), Jianmin Wang (1,3), Michael I. Jordan (4). (1) School of Software, (2) Department of Electronic Engineering, (3) BNRist, Research Center for Big Data, Tsinghua University, Beijing, China; (4) University of California, Berkeley, USA.
Pseudocode | Yes | "We summarize the detailed training procedure in Algorithm 1. TAT runs over the feature-level examples f and propagates only through the deep classifier C (usually of no more than three layers), which is very computationally efficient (an order of magnitude faster than feature adaptation methods)."
Open Source Code | Yes | "We evaluate TAT on five domain adaptation datasets. Codes and datasets are made available at github.com/thuml/Transferable-Adversarial-Training." (Section 5, Experiments)
Open Datasets | Yes | "We evaluate TAT on five domain adaptation datasets. Codes and datasets are made available at github.com/thuml/Transferable-Adversarial-Training." (Section 5, Experiments)
Dataset Splits | Yes | "We use reverse validation for hyperparameter selection (Zhong et al., 2010)."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory used for running experiments.
Software Dependencies | No | The paper mentions using Adam, ResNet-50, and scikit-learn, but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | "For image datasets, we use ResNet-50 (He et al., 2016) pretrained on ImageNet (Russakovsky et al., 2015) to extract original feature representations. We use Adam (Kingma & Ba, 2014) with initial learning rate η₀ = 10⁻⁴. We adopt the inverse-decay strategy of DANN (Ganin et al., 2016), where the learning rate changes as η_p = η₀ / (1 + ωp)^φ, with ω = 10, φ = 0.75, and p the training progress ranging from 0 to 1. For image datasets, β = 5 and γ = 1."
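The inverse-decay learning-rate schedule quoted in the experiment setup can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is ours, and the default hyperparameter values (η₀ = 10⁻⁴, ω = 10, φ = 0.75) are the ones reported in the paper.

```python
def inverse_decay_lr(p, eta0=1e-4, omega=10.0, phi=0.75):
    """DANN-style inverse-decay learning rate.

    p is the training progress in [0, 1]; returns eta0 / (1 + omega * p) ** phi,
    so the rate starts at eta0 and decays smoothly as training progresses.
    """
    assert 0.0 <= p <= 1.0, "progress p must lie in [0, 1]"
    return eta0 / (1.0 + omega * p) ** phi
```

With the quoted values, the rate starts at 10⁻⁴ and, by the end of training (p = 1), has decayed by a factor of 11^0.75 ≈ 6.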
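The "Dataset Splits" row notes that hyperparameters are selected by reverse validation (Zhong et al., 2010). The paper only cites the technique, so the following is a simplified pure-Python sketch of the general idea, with a toy nearest-centroid classifier standing in for the actual model; all names here are ours.

```python
from statistics import mean

class NearestCentroid:
    """Toy stand-in classifier: predicts the class whose feature mean is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            pts = [x for x, l in zip(X, y) if l == label]
            self.centroids[label] = [mean(col) for col in zip(*pts)]
        return self

    def predict(self, X):
        def sqdist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return [min(self.centroids, key=lambda c: sqdist(x, self.centroids[c]))
                for x in X]

    def score(self, X, y):
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)

def reverse_validation_score(Xs, ys, Xt, make_model=NearestCentroid):
    """Reverse validation, simplified: train a forward model on labeled source
    data, pseudo-label the unlabeled target data, train a reverse model on the
    pseudo-labeled target data, and score it on the source labels. The returned
    accuracy serves as a proxy for comparing hyperparameter settings when no
    target labels are available."""
    forward = make_model().fit(Xs, ys)
    pseudo = forward.predict(Xt)
    reverse = make_model().fit(Xt, pseudo)
    return reverse.score(Xs, ys)
```

Candidate hyperparameter settings would each be scored this way, and the setting with the highest reverse-validation accuracy kept.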