Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation

Authors: Kaichao You, Ximei Wang, Mingsheng Long, Michael Jordan

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we conduct a series of experiments to empirically evaluate the proposed DEV approach."
Researcher Affiliation | Academia | "School of Software, BNRist, Research Center for Big Data, Tsinghua University, Beijing, China; University of California, Berkeley, USA."
Pseudocode | Yes | "The complete validation procedure, which is called Deep Embedded Validation (DEV), is described in Algorithms 1 and 2." (A hedged sketch of the DEV risk computation appears after this table.)
Open Source Code | Yes | "The code of DEV is available at https://github.com/thuml/Deep-Embedded-Validation."
Open Datasets | Yes | "Figures 1(a) and 1(b) show a toy regression data following the protocol of Sugiyama et al. (2007). [...] VisDA (Peng et al., 2018) is a large-scale cross-domain dataset designed for domain adaptation in computer vision. [...] Office-31 (Saenko et al., 2010) is a standard dataset for visual domain adaptation. [...] Digits (Ganin et al., 2016) dataset consists of three domains: MNIST, USPS and SVHN."
Dataset Splits | Yes | "Since DEV is carried out on the source data, we can split the source data into train/validation set before learning. That said, we use the hold-out validation method throughout all experiments." (A minimal split sketch follows the table.)
Hardware Specification | No | The paper discusses deep learning models and architectures such as ResNet-50, but it does not specify hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries) used in the experiments.
Experiment Setup | Yes | "With deep models, we try the following learning rates: 10^-2, 10^-2.5, 10^-3, 10^-3.5, 10^-4. Other hyperparameters are specified in their subsections respectively. [...] A key hyperparameter in MCD is the number of generator update iterations, denoted as k. We try k = 1, 2, 3, 4, 5 together with various learning rates to select the best model. [...] In CDAN, an important hyperparameter is the trade-off coefficient λ, which balances between the transferability and the discriminability of the learned representations. We implement several trade-offs (λ = 0.5, 0.75, 1.0, 1.25, 1.5, with λ = 1 as its default setting) along with several learning rate configurations. [...] Besides the learning rate, we also tune the hyperparameters α and β of GTA. [...] Besides the learning rate, we also tune the hyperparameter update iteration of PADA in {300, 400, 500, 600, 700} (with 500 as its default setting)." (A hedged model-selection sketch over these grids follows the table.)
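
As referenced in the Pseudocode row: the paper's Algorithm 2 computes the DEV risk as an importance-weighted validation risk plus a control variate that shrinks its variance. Below is a minimal NumPy sketch reconstructed from the paper's description; the function names `dev_risk` and `density_ratio` are ours, not from the released code, and the exact weight normalization is our reading of the method rather than a verified detail.

```python
import numpy as np

def density_ratio(p_source, n_source, n_target):
    """Importance weights w(x) from a domain discriminator's estimate of
    P(domain = source | features). Assumption: this normalization matches
    the density-ratio trick described in the paper."""
    return (n_source / n_target) * (1.0 - p_source) / p_source

def dev_risk(losses, weights):
    """DEV risk (our reimplementation of the paper's Algorithm 2):
    mean(w * l) + eta * (mean(w) - 1), where eta is the optimal
    control-variate coefficient -Cov(w * l, w) / Var(w)."""
    weighted_loss = weights * losses
    eta = -np.cov(weighted_loss, weights)[0, 1] / np.var(weights, ddof=1)
    # E[w] = 1 under the true density ratio, so the eta term has zero mean
    # in expectation and serves only to reduce the variance of the estimate.
    return weighted_loss.mean() + eta * (weights.mean() - 1.0)
```

The control variate is what distinguishes DEV from plain importance-weighted validation: because the true expectation of the weights is known to be 1, any linear combination with the centered weights is an unbiased estimator, and `eta` picks the minimum-variance one.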
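The hold-out protocol in the Dataset Splits row amounts to reserving part of the labeled source data for validation before any training, while the target data stays unlabeled. A toy scikit-learn sketch; the 80/20 ratio and the feature shapes are illustrative assumptions, not values from the paper:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for labeled source data; shapes are illustrative only
# (e.g., 2048-dim ResNet-50 features, 31 Office-31 classes).
source_x = np.random.randn(1000, 2048)
source_y = np.random.randint(0, 31, size=1000)

# Hold out 20% of the source data as the validation set (assumed ratio).
src_train_x, src_val_x, src_train_y, src_val_y = train_test_split(
    source_x, source_y, test_size=0.2, random_state=0)
```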
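Putting the grids from the Experiment Setup row together, model selection reduces to training one model per hyperparameter configuration and keeping the one with the lowest DEV risk. The sketch below uses `dev_risk` from the first code block; `train_and_validate` is a hypothetical stand-in for the real MCD training pipeline, which the paper does not spell out at this level:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def train_and_validate(lr, k):
    """Hypothetical stand-in: train MCD with learning rate `lr` and `k`
    generator update iterations, then return per-example validation losses
    and importance weights. Here both are dummy random draws."""
    losses = rng.exponential(1.0, size=200)
    weights = rng.lognormal(0.0, 0.5, size=200)
    return losses, weights

learning_rates = [10**e for e in (-2, -2.5, -3, -3.5, -4)]
ks = [1, 2, 3, 4, 5]  # MCD generator update iterations, as tuned in the paper

best = None
for lr, k in itertools.product(learning_rates, ks):
    losses, weights = train_and_validate(lr, k)
    risk = dev_risk(losses, weights)  # lower DEV risk => preferred model
    if best is None or risk < best[0]:
        best = (risk, lr, k)

print(f"selected lr={best[1]:.5f}, k={best[2]} with DEV risk {best[0]:.3f}")
```

The same loop structure applies to the other methods' grids (λ for CDAN, α and β for GTA, update iterations for PADA); only the inner training call changes.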