A DIRT-T Approach to Unsupervised Domain Adaptation

Authors: Rui Shu, Hung Bui, Hirokazu Narui, Stefano Ermon

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive empirical results demonstrate that the combination of these two models significantly improves the state-of-the-art performance on the digit, traffic sign, and Wi-Fi recognition domain adaptation benchmarks."
Researcher Affiliation | Collaboration | Stanford University; DeepMind. {ruishu,hirokaz2,ermon}@stanford.edu, {buih}@google.com
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found; the methods are described using mathematical equations and textual explanations.
Open Source Code | Yes | "Pronounce as dirty. Implementation available at https://github.com/RuiShu/dirt-t"
Open Datasets | Yes | "We report results for domain adaptation in digits classification (MNIST-M, MNIST, SYN DIGITS, SVHN), traffic sign classification (SYN SIGNS, GTSRB), general object classification (STL-10, CIFAR-10), and Wi-Fi activity recognition (Yousefi et al., 2017)."
Dataset Splits | Yes | "For each task, we tuned the four hyperparameters (λd, λs, λt, β) by randomly selecting 1000 labeled target samples from the training set and using that as our validation set."
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, cloud instances, memory) used for running the experiments were provided.
Software Dependencies | No | No software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x) were provided. The paper mentions the Adam optimizer and a Leaky ReLU parameter a = 0.1, but no specific software environment.
Experiment Setup | Yes | "Hyperparameters. For each task, we tuned the four hyperparameters (λd, λs, λt, β) by randomly selecting 1000 labeled target samples from the training set and using that as our validation set. A complete list of the hyperparameters is provided in Appendix B. VADA was trained for 80000 iterations and DIRT-T takes VADA as initialization and was trained for {20000, 40000, 60000, 80000} iterations, with the number of iterations chosen as a hyperparameter."
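The dataset-split protocol quoted above (holding out 1000 randomly selected labeled target samples as a validation set for tuning λd, λs, λt, and β) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and array arguments are hypothetical.

```python
import numpy as np

def make_target_validation_split(x_target, y_target, n_val=1000, seed=0):
    """Randomly hold out n_val labeled target samples as a validation set,
    mirroring the paper's hyperparameter-tuning protocol (illustrative only)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x_target))          # shuffle sample indices
    val_idx, train_idx = idx[:n_val], idx[n_val:]  # first n_val go to validation
    return (x_target[train_idx], y_target[train_idx],
            x_target[val_idx], y_target[val_idx])
```

With a fixed seed the split is reproducible, which matters when the same validation set is reused to compare VADA and DIRT-T runs trained for different numbers of iterations.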