ToAlign: Task-Oriented Alignment for Unsupervised Domain Adaptation

Authors: Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen

NeurIPS 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimental results on various benchmarks (e.g., Office-Home, VisDA-2017, and DomainNet) under different domain adaptation settings demonstrate the effectiveness of ToAlign, which helps achieve state-of-the-art performance.
Researcher Affiliation Collaboration University of Science and Technology of China; Microsoft Research Asia
Pseudocode No The paper describes the methods and procedures using text and equations, but no explicitly labeled 'Pseudocode' or 'Algorithm' block is provided.
Open Source Code Yes The code is publicly available at https://github.com/microsoft/UDA.
Open Datasets Yes We use two commonly used benchmark datasets (i.e., Office-Home [60] and VisDA-2017 [46]) for SUDA and a large-scale dataset DomainNet [43] for MUDA and SSDA.
Dataset Splits No The paper describes the use of labeled source and unlabeled target data for different adaptation settings (SUDA, MUDA, SSDA) and mentions specific dataset configurations (e.g., one-shot/three-shot for SSDA), but it does not provide explicit training, validation, and test dataset split percentages or sample counts to reproduce the data partitioning.
Hardware Specification Yes Table 4: Training complexity comparison (on GTX TITAN X GPU) in terms of computational time (of one iteration) and GPU memory for a mini-batch with batch size 32.
Software Dependencies No The paper describes the network architecture and training details but does not provide specific software dependency versions (e.g., Python, PyTorch, or library versions).
Experiment Setup Yes We use the ResNet-50 [21] pre-trained on ImageNet [30] as the backbone for SUDA, while using ResNet-101 and ResNet-34 for MUDA and SSDA respectively. Following [64, 40, 12], the image classifier C is composed of one fully connected layer. The discriminator D consists of three fully connected layers with inserted dropout and ReLU layers. We follow [69] to take an annealing strategy to set the learning rate η, i.e., η_t = η_0 / (1 + γp)^τ, where p indicates the progress of training that increases linearly from 0 to 1, γ = 10, and τ = 0.75. The initial learning rate η_0 is set to 1e-3, 3e-4, 3e-4, and 1e-3 for SUDA on Office-Home, SUDA on VisDA-2017, MUDA on DomainNet, and SSDA on DomainNet, respectively.
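The annealing schedule quoted above can be sketched directly. The helper below (a hypothetical function name, not from the paper's code) is a minimal rendering of η_t = η_0 / (1 + γp)^τ with the quoted γ = 10 and τ = 0.75:

```python
def annealed_lr(eta0: float, p: float, gamma: float = 10.0, tau: float = 0.75) -> float:
    """Annealed learning rate: eta_t = eta0 / (1 + gamma * p) ** tau.

    p is the training progress, increasing linearly from 0 to 1;
    gamma = 10 and tau = 0.75 follow the quoted setup.
    """
    return eta0 / (1.0 + gamma * p) ** tau

# Example: SUDA on Office-Home starts at eta0 = 1e-3 and decays monotonically.
lr_start = annealed_lr(1e-3, 0.0)  # equals eta0 at the start of training
lr_end = annealed_lr(1e-3, 1.0)    # eta0 / 11 ** 0.75 at the end
```

In practice such a per-iteration schedule would be wired into the optimizer (e.g., via a lambda-based LR scheduler), with p computed as the current iteration divided by the total number of iterations.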