Bridging Theory and Algorithm for Domain Adaptation

Authors: Yuchen Zhang, Tianle Liu, Mingsheng Long, Michael Jordan

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A series of empirical studies show that our algorithm achieves the state of the art accuracies on challenging domain adaptation tasks. We evaluate the proposed learning method on three datasets against state of the art deep domain adaptation methods.
Researcher Affiliation | Academia | (1) School of Software, (2) Research Center for Big Data, BNRist, (3) Department of Mathematical Sciences, Tsinghua University, China; (4) University of California, Berkeley, USA.
Pseudocode | No | The paper includes an adversarial network diagram (Figure 1) but does not provide structured pseudocode or algorithm blocks (a hedged training-step sketch is given after this table).
Open Source Code | Yes | The code is available at github.com/thuml/MDD.
Open Datasets | Yes | Office-31 (Saenko et al., 2010) is a standard domain adaptation dataset... Office-Home (Venkateswara et al., 2017) is a more complex dataset... VisDA-2017 (Peng et al., 2017) is a simulation-to-real dataset... (a hedged data-loading sketch follows the table).
Dataset Splits | Yes | We follow the commonly used experimental protocol for unsupervised domain adaptation from Ganin & Lempitsky (2015); Long et al. (2018). We report the average accuracies of five independent experiments. Importance-weighted cross-validation (IWCV) is employed in all experiments for the selection of hyper-parameters (a sketch of the IWCV risk estimate follows the table).
Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or cloud instance types used for running experiments.
Software Dependencies | No | The paper mentions "We implement our algorithm in PyTorch." but does not specify a version number for PyTorch or other software dependencies.
Experiment Setup | Yes | The asymptotic value of coefficient η is fixed to 0.1 and γ is chosen from {2, 3, 4} and kept the same for all tasks on the same dataset. ... The main classifier and auxiliary classifier are both 2-layer neural networks with width 1024. For optimization, we use mini-batch SGD with Nesterov momentum 0.9. The learning rate of the classifiers is set to 10 times that of the feature extractor, the value of which is adjusted according to Ganin et al. (2016). (A sketch of this setup is given below.)
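
To make the quoted experiment setup concrete, here is a minimal sketch of the classifier heads, optimizer grouping, and learning-rate annealing described above. It is an illustration under stated assumptions, not the authors' released code: the ResNet-50 backbone, the base learning rate, the weight decay, and the annealing constants (alpha=10, beta=0.75, the schedule form used by Ganin et al., 2016) are all assumptions.

```python
# Sketch of the quoted optimizer setup; assumptions are noted in comments.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 31        # e.g. Office-31
head_width = 1024       # "2-layer neural networks with width 1024"
base_lr = 0.004         # hypothetical base learning rate for the feature extractor

# Backbone: a ResNet-50 feature extractor is assumed (ImageNet-pretrained in practice).
backbone = models.resnet50()
feat_dim = backbone.fc.in_features
backbone.fc = nn.Identity()

def make_head():
    # Each head is a 2-layer network of width 1024, per the quoted setup.
    return nn.Sequential(nn.Linear(feat_dim, head_width), nn.ReLU(),
                         nn.Linear(head_width, num_classes))

main_classifier = make_head()       # f
auxiliary_classifier = make_head()  # f', the adversarial head

# Mini-batch SGD with Nesterov momentum 0.9; classifier learning rates are
# 10x that of the feature extractor. The weight decay value is an assumption.
optimizer = torch.optim.SGD(
    [{"params": backbone.parameters(), "lr": base_lr},
     {"params": main_classifier.parameters(), "lr": 10 * base_lr},
     {"params": auxiliary_classifier.parameters(), "lr": 10 * base_lr}],
    momentum=0.9, nesterov=True, weight_decay=5e-4)

for group in optimizer.param_groups:
    group["initial_lr"] = group["lr"]

def anneal_lr(optimizer, progress, alpha=10.0, beta=0.75):
    """Decay each group's lr by 1 / (1 + alpha * progress) ** beta, keeping the
    10x ratio between heads and backbone; the schedule form follows Ganin et al.
    (2016), and the alpha/beta values are assumptions (progress is in [0, 1])."""
    decay = (1.0 + alpha * progress) ** (-beta)
    for group in optimizer.param_groups:
        group["lr"] = group["initial_lr"] * decay
```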
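
Since the paper provides no algorithm block, the following is a hedged sketch of what a single MDD-style training objective might look like, reconstructed from the paper's description of the main classifier, the auxiliary (adversarial) classifier, and the margin factor γ. The GradReverse helper, the use of η as the gradient-reversal coefficient, and all function names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated and scaled
    gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, coeff):
        ctx.coeff = coeff
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.coeff * grad_output, None

def mdd_loss(feature_extractor, main_head, adv_head, x_s, y_s, x_t,
             gamma=3.0, eta=0.1):
    """One training step's loss: source cross-entropy plus the adversarial
    margin disparity term, with gamma weighting the source part."""
    feats = feature_extractor(torch.cat([x_s, x_t], dim=0))
    logits = main_head(feats)                              # main classifier f
    logits_adv = adv_head(GradReverse.apply(feats, eta))   # auxiliary classifier f'

    n_s = x_s.size(0)
    logits_s, logits_t = logits[:n_s], logits[n_s:]
    adv_s, adv_t = logits_adv[:n_s], logits_adv[n_s:]

    # Labels predicted by f serve as targets for f'.
    pred_s = logits_s.argmax(dim=1).detach()
    pred_t = logits_t.argmax(dim=1).detach()

    cls_loss = F.cross_entropy(logits_s, y_s)
    # f' is pushed to agree with f on source samples ...
    adv_loss_s = F.cross_entropy(adv_s, pred_s)
    # ... and to disagree with f on target samples (log(1 - p) term).
    adv_loss_t = F.nll_loss(
        torch.log(1.0 - F.softmax(adv_t, dim=1) + 1e-6), pred_t)

    # The gradient reversal makes the feature extractor play the opposite
    # (minimizing) side of this adversarial term.
    return cls_loss + gamma * adv_loss_s + adv_loss_t
```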
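
For the listed benchmarks, a minimal data-loading sketch under the standard unsupervised protocol (labeled source domain, unlabeled target domain) might look as follows. The directory layout, domain names, batch size, and augmentations are assumptions, not taken from the paper.

```python
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical layout office31/<domain>/<class>/*.jpg for one Office-31 task.
source = datasets.ImageFolder("office31/amazon", transform=transform)  # labeled source
target = datasets.ImageFolder("office31/webcam", transform=transform)  # unlabeled target

source_loader = DataLoader(source, batch_size=32, shuffle=True, drop_last=True)
target_loader = DataLoader(target, batch_size=32, shuffle=True, drop_last=True)
```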
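
Finally, the IWCV criterion mentioned under Dataset Splits estimates target risk from held-out labeled source data reweighted by the density ratio between domains. A minimal sketch of that risk estimate is below; how the density ratios are obtained is not specified in this report and is assumed given.

```python
import numpy as np

def iwcv_risk(losses, weights):
    """Importance-weighted validation risk on a held-out labeled source set:
    mean of w(x) * loss(x), where w(x) ~= p_target(x) / p_source(x)."""
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.mean(weights * losses))

# Toy example: hyper-parameter selection picks the setting with the lowest risk.
per_example_losses = [0.2, 0.9, 0.4, 0.1]
density_ratios = [1.3, 0.7, 1.1, 0.9]   # assumed estimates of p_target / p_source
print(iwcv_risk(per_example_losses, density_ratios))
```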