Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation
Authors: Kaichao You, Ximei Wang, Mingsheng Long, Michael Jordan
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a series of experiments to empirically evaluate the proposed DEV approach. |
| Researcher Affiliation | Academia | ¹School of Software, ²BNRist, Research Center for Big Data, Tsinghua University, Beijing, China; ³University of California, Berkeley, USA. |
| Pseudocode | Yes | The complete validation procedure, which is called Deep Embedded Validation (DEV), is described in Algorithms 1 and 2. |
| Open Source Code | Yes | The code of DEV is available at https://github.com/thuml/Deep-Embedded-Validation. |
| Open Datasets | Yes | Figures 1(a) and 1(b) show toy regression data following the protocol of Sugiyama et al. (2007). [...] VisDA (Peng et al., 2018) is a large-scale cross-domain dataset designed for domain adaptation in computer vision. [...] Office-31 (Saenko et al., 2010) is a standard dataset for visual domain adaptation. [...] The Digits (Ganin et al., 2016) dataset consists of three domains: MNIST, USPS and SVHN. |
| Dataset Splits | Yes | Since DEV is carried out on the source data, we can split the source data into train/validation sets before learning. Accordingly, we use the hold-out validation method throughout all experiments. |
| Hardware Specification | No | The paper discusses deep learning models and architectures like ResNet-50, but it does not specify any hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries with their versions) that were used in the experiments. |
| Experiment Setup | Yes | With deep models, we try the following learning rates: 10^-2, 10^-2.5, 10^-3, 10^-3.5, 10^-4. Other hyperparameters are specified in their respective subsections. [...] A key hyperparameter in MCD is the number of generator update iterations, denoted as k. We try k = 1, 2, 3, 4, 5 together with various learning rates to select the best model. [...] In CDAN, an important hyperparameter is the trade-off coefficient λ, which balances the transferability and the discriminability of the learned representations. We implement several trade-offs (λ = 0.5, 0.75, 1.0, 1.25, 1.5, with λ = 1 as its default setting) along with several learning-rate configurations. [...] Besides the learning rate, we also tune the hyperparameters α and β of GTA. [...] Besides the learning rate, we also tune the update-iteration hyperparameter of PADA in {300, 400, 500, 600, 700} (with 500 as its default setting). |
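For context on how the quoted hyperparameter grids feed into model selection, the sketch below illustrates the DEV-style validation risk described in the paper (importance-weighted hold-out risk with a control variate), together with a hypothetical search grid mirroring the CDAN setup above. The function and variable names, and the synthetic inputs, are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def dev_risk(losses, weights):
    """Variance-reduced, importance-weighted validation risk in the spirit of
    DEV: mean(w*l) + eta*mean(w) - eta, with the control-variate coefficient
    eta = -Cov(w*l, w) / Var(w).

    losses  : per-example losses on the held-out source validation split
    weights : estimated density ratios w(x) ~ p_target(x) / p_source(x),
              e.g. obtained from a domain discriminator on deep features
    """
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weighted = weights * losses
    cov = np.cov(weighted, weights)          # 2x2 sample covariance matrix
    eta = -cov[0, 1] / cov[1, 1]             # control-variate coefficient
    return weighted.mean() + eta * weights.mean() - eta

# Hypothetical grid mirroring the learning rates and CDAN trade-offs quoted above:
grid = [(lr, lam)
        for lr in (1e-2, 10 ** -2.5, 1e-3, 10 ** -3.5, 1e-4)
        for lam in (0.5, 0.75, 1.0, 1.25, 1.5)]
# For each (lr, lam): train a model, compute validation losses and weights,
# then select the configuration minimizing dev_risk(losses, weights).
```

The selected model is the one with the lowest estimated target risk, which is how the paper compares configurations without touching target labels.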