Multi-Adversarial Domain Adaptation
Authors: Zhongyi Pei, Zhangjie Cao, Mingsheng Long, Jianmin Wang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evidence demonstrates that the proposed model outperforms state-of-the-art methods on standard domain adaptation datasets. |
| Researcher Affiliation | Academia | Zhongyi Pei, Zhangjie Cao, Mingsheng Long, Jianmin Wang; KLiss, MOE; NEL-BDS; TNList; School of Software, Tsinghua University, China. Emails: {peizhyi,caozhangjie14}@gmail.com, {mingsheng,jimwang}@tsinghua.edu.cn |
| Pseudocode | No | The information is insufficient. The paper includes an architecture diagram (Figure 2) but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The codes, datasets and configurations will be available online at github.com/thuml. |
| Open Datasets | Yes | Office-31 (Saenko et al. 2010) is a standard benchmark for visual domain adaptation... and ImageCLEF-DA is a benchmark dataset for the ImageCLEF 2014 domain adaptation challenge (http://imageclef.org/2014/adaptation). |
| Dataset Splits | Yes | We follow standard evaluation protocols for unsupervised domain adaptation (Long et al. 2015; Ganin and Lempitsky 2015). For both Office-31 and Image CLEF-DA datasets, we use all labeled source examples and all unlabeled target examples. ... We also adopt transfer cross-validation (Zhong et al. 2010) to select parameter λ for the MADA models. |
| Hardware Specification | No | The information is insufficient. The paper does not specify any particular hardware components like GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The information is insufficient. The paper mentions 'Caffe (Jia et al. 2014) framework' but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We employ mini-batch stochastic gradient descent (SGD) with momentum of 0.9 and the learning rate strategy implemented in RevGrad (Ganin and Lempitsky 2015): the learning rate is not selected by a grid search due to the high computational cost; it is adjusted during SGD using the formula η_p = η_0 / (1 + α·p)^β, where p is the training progress linearly changing from 0 to 1, η_0 = 0.01, α = 10 and β = 0.75... To suppress noisy activations at the early stages of training, instead of fixing parameter λ, we gradually change it by multiplying by 2 / (1 + exp(−δ·p)) − 1, where δ = 10 (Ganin and Lempitsky 2015). |
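
The two schedules quoted in the Experiment Setup row are simple to reproduce. Below is a minimal Python sketch of the RevGrad learning-rate annealing and the λ ramp-up factor, assuming the constants reported above (η_0 = 0.01, α = 10, β = 0.75, δ = 10); the function names are illustrative and not taken from the authors' released code.

```python
import math

# Sketch of the learning-rate annealing and lambda ramp-up schedules described
# in the Experiment Setup row (Ganin and Lempitsky 2015). Constants follow the
# quoted values; function names are illustrative, not from the paper's code.

ETA_0 = 0.01   # initial learning rate eta_0
ALPHA = 10.0   # alpha in the annealing formula
BETA = 0.75    # beta in the annealing formula
DELTA = 10.0   # delta in the lambda ramp-up


def learning_rate(p: float) -> float:
    """eta_p = eta_0 / (1 + alpha * p)^beta, with p the training progress in [0, 1]."""
    return ETA_0 / (1.0 + ALPHA * p) ** BETA


def lambda_factor(p: float) -> float:
    """Ramp-up factor 2 / (1 + exp(-delta * p)) - 1, growing from 0 toward 1."""
    return 2.0 / (1.0 + math.exp(-DELTA * p)) - 1.0


if __name__ == "__main__":
    total_iters = 10000  # assumed total iteration count, for illustration only
    for it in (0, 2500, 5000, 10000):
        p = it / total_iters  # training progress, linear from 0 to 1
        print(f"p={p:.2f}  lr={learning_rate(p):.5f}  lambda_scale={lambda_factor(p):.3f}")
```

At p = 0 the ramp-up factor is 0, so the adversarial term is effectively disabled at the start of training, and it approaches 1 as p grows, which matches the stated goal of suppressing noisy activations in the early stages.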